ABSTRACT

We perform an extensive analysis of the C iv λ1549 line in three large spectroscopic
surveys of quasars. Differing approaches for fitting the C iv line can be found
in the literature, and we compare the most common methods to highlight the relative 
systematics associated with each. We choose the line fitting procedure that results in 
a symmetric profile for the C iv line and gives accurate fits to local emission features 
around the line, and use this approach to measure the width of the C iv line in spectra 
from the SDSS, 2QZ and 2SLAQ surveys. 

The results are compared with a previous study of the Mg ii λ2799 line in the same
sample. We find the C iv line tends to be broader than the Mg ii line in spectra that
have both lines, and the average ratio between the lines is consistent with a simplistic
model for a photoionised, virialised and stratified broad-line region. There exists a
statistically significant correlation between the widths of the C iv and Mg ii lines.
However, the correlation is weak, and the scatter around a best fit is only marginally 
less than the full dynamic range of line widths. 

Motivated by previous work on the Mg ii line, we examine the dispersion in the
distribution of C iv line widths. We find that the dispersion in C iv line widths is
essentially independent of both redshift and luminosity. This result is in stark
contrast to the Mg ii line, which shows a strong luminosity dependence. Furthermore, we
demonstrate that the low level of dispersion in C iv line width (~0.08 dex) is
inconsistent with a pure-disk model for the emitting region and use our data to constrain
simple models for the broad-line region.

Finally we consider our results in terms of their implications for the virial
technique for estimating black hole masses. The inconsistency between Mg ii and C iv
line widths in single spectra, combined with the differing behaviour of the Mg ii and
C iv line width distributions as a whole, indicates that there must be an inconsistency
between Mg ii and C iv virial mass estimators. Furthermore, the level of intrinsic
dispersion in Mg ii and C iv line widths contributes less dynamic range to virial mass
estimates than the error associated with the estimates. The indication is that the 
line width term in these UV virial mass estimators may be essentially irrelevant with 
respect to the typical uncertainty on a mass estimate. 

Key words: galaxies: quasars: general - quasars: emission lines 



1 INTRODUCTION 

This paper describes an extensive analysis of the C iv λ1549 line
width distribution in three large spectroscopic samples
of quasi-stellar objects (QSOs). C iv has the highest ionisation
potential of any of the strong broad emission lines in
QSO spectra and is the best probe of the high-ionisation
inner regions of the broad-line region (BLR). Furthermore,
like Hβ and Mg ii λ2799, the C iv line is commonly used
to calculate super-massive black hole (SMBH) masses for
QSOs. The line is observed in optical spectra with redshifts
between ~1.5 and 5 (corresponding to a look-back time of
9 to 12 Gyr), and is the only line used to calculate SMBH
masses in the highest redshift objects.

E-mail: stephen.fine@durham.ac.uk

The redshift range between ~1.5 and 5 is of particular
importance to quasar astrophysics since it spans the
so-called 'quasar epoch' at z ~ 2 to 3. Before z ~ 2, the space
density of QSOs has been observed to increase with time
(e.g. Osmer 1982; Fan et al. 2001; Richards et al. 2006)
but since z ~ 2 it has fallen (e.g. Longair 1966; Schmidt
1968; Croom et al. 2004; Richards et al. 2006). The period
between z ~ 2 and 3 marks the peak of quasar activity
in the Universe. Understanding the processes which caused 
this ramping up of activity in the early Universe and what 
is responsible for its reversal is a major goal in QSO science. 
The fact that the C iv line is visible in optical spectra over 
this entire range potentially makes it an attractive probe of 
QSOs at these epochs. 

In section 2 we review virial SMBH mass estimation and
the use of the C iv line with this technique. Section 3 gives
a brief description of the three datasets used in this work.
In section 4 we present the results of our fitting and make a
comparison with a previous analysis of the Mg ii line in the
same sample of quasars from Fine et al. (2008). Motivated
by the results in Fine et al. (2008), section 5 describes an
investigation of the dispersion in C iv line widths as a
function of redshift and luminosity. In sections 6 and 7 we discuss
our results with respect to virial SMBH mass estimation and
BLR geometry. Details of the analytic procedure, including
a discussion of our line fitting analysis and a prescription
for removing broad-absorption lines (BALs) from our data,
can be found in appendices A and B respectively. Throughout
this paper we assume a flat (Ω_m, Ω_Λ) = (0.3, 0.7),
H_0 = 70 km s^-1 Mpc^-1 cosmology.



2 C IV AND VIRIAL SMBH MASS ESTIMATION

The most common approach for measuring the mass of QSO
SMBHs is through studies of the BLR. Reverberation mapping
(Blandford & McKee 1982) allows the size of the BLR
to be derived through studying the time lag between continuum
and broad-line variability in QSO spectra. Combining
an estimate for the size of the BLR with an assumption that
the BLR is virialised, the mass of the central SMBH can be
estimated (e.g. Peterson et al. 2004). Given a QSO with a
BLR of radius r_BLR and virial velocity v_BLR (estimated from
the width of an emission line), the central mass is given by

$$ M_{\rm BH} = f \, \frac{r_{\rm BLR} \, v_{\rm BLR}^2}{G} \qquad (1) $$

Here G is the gravitational constant and the factor f is defined
by the geometry and orientation of the BLR, which are unknown
(see e.g. Peterson & Wandel 1999; McLure & Dunlop 2001;
Collin et al. 2006; Labita et al. 2006 for discussions on
the value of f).

Reverberation mapping requires observations over an
extended period of time and as a consequence only a few
tens of systems have been adequately studied in this fashion
(Kaspi et al. 2000; Peterson et al. 2004). However, in recent
years a technique for estimating SMBH masses from single
epoch spectra has been developed: the 'virial' method.

The virial technique for estimating SMBH masses is
based on the radius-luminosity relation measured for the
Hβ BLR in reverberation mapped systems (Wandel et al.
1999). This tight correlation between the continuum luminosity
of Seyfert 1s and the Hβ BLR size allows for single
epoch empirical estimation of the BLR size. The estimated
radius is combined with the velocity width of the Hβ line to
give a virial SMBH mass estimate.

The virial mass is estimated with a relation of the form

$$ M_{\rm BH} = A \, (\lambda L_{\lambda})^{\alpha} \, {\rm FWHM}^2 \qquad (2) $$

where FWHM is the full width at half maximum of the Hβ
line, L_λ is the monochromatic luminosity of the continuum
at wavelength λ (taken near the line), A is a normalisation
constant, and the exponent α gives the luminosity
dependence of the radius-luminosity relation.
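
As a rough illustration of how equations 1 and 2 are evaluated, the following is a minimal sketch. The function names, the placeholder choice f = 1, and all numerical inputs are assumptions made for the example and are not taken from this paper; no particular published calibration of A and α is implied.

```python
import math

G_CGS = 6.674e-8      # gravitational constant [cm^3 g^-1 s^-2]
M_SUN_G = 1.989e33    # solar mass [g]
LIGHT_DAY_CM = 2.59e15


def reverberation_mass(r_blr_cm, v_blr_cm_s, f=1.0):
    """Equation 1: M_BH = f * r_BLR * v_BLR^2 / G, returned in solar masses.

    f = 1 is only a placeholder; the geometry factor is unknown.
    """
    return f * r_blr_cm * v_blr_cm_s ** 2 / G_CGS / M_SUN_G


def log_virial_mass(fwhm, lam_l_lam, log_a, alpha):
    """Equation 2 in log form:
    log M = log A + alpha * log(lambda * L_lambda) + 2 * log(FWHM)."""
    return log_a + alpha * math.log10(lam_l_lam) + 2.0 * math.log10(fwhm)


# Illustrative inputs only: a 10 light-day BLR with v = 3000 km/s.
mass = reverberation_mass(10 * LIGHT_DAY_CM, 3000e5)

# Effect of a 0.08 dex change in line width at fixed luminosity
# (log A = 0 and alpha = 0.5 are arbitrary and cancel in the difference).
shift = (log_virial_mass(10 ** 3.58, 1e45, 0.0, 0.5)
         - log_virial_mass(10 ** 3.50, 1e45, 0.0, 0.5))
print(f"M_BH ~ {mass:.2e} M_sun; 0.08 dex in width -> {shift:.2f} dex in mass")
```

Because the line width enters squared, a given scatter in log width maps to twice that scatter in log mass, whatever values A and α take.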

While the radius-luminosity relation is only well established
for the Hβ BLR, there is growing evidence for a similar
relation for C iv (Kaspi et al. 2007). Assuming the
existence of equivalent radius-luminosity relations, secondary
virial mass estimators based on other emission lines have
been calibrated. Most commonly the Mg ii or C iv lines are
used as they are strong, relatively unblended features and
are evident in optical spectra of progressively higher redshift
objects. The relations for these lines also take the form of
equation 2, where the FWHM is measured from the new line,
and the continuum luminosity is taken in the vicinity of that
new line. The quantities A and α for these secondary virial
estimators are generally calibrated against SMBH masses
measured for the same sources from the Hβ line. These
secondary virial calibrations have been shown to be consistent
with Hβ virial and reverberation mapping masses to within
~0.3 dex over several orders of magnitude in SMBH mass
(e.g. McLure & Jarvis 2002; Vestergaard 2002).

This paper is primarily concerned with the C iv
line and, while the C iv line is frequently used to
calculate SMBH masses for QSOs (Vestergaard 2002;
Vestergaard & Peterson 2006), there has been some controversy
in the literature as to how well suited C iv is for this
sort of analysis. Most of the debate has focused on the fact
that C iv is a considerably higher ionisation line than Hβ or
Mg ii, and hence could be emitted from a different part of
the BLR (e.g. Onken & Peterson 2002). In addition, the C iv
line profile is known to display asymmetries, due primarily to
absorption in both wings of the line. The C iv line also tends
to be blueshifted with respect to lower ionisation lines and
narrow lines in QSO spectra (Gaskell 1982; Richards et al.
2002), indicating potentially differing dynamics for the low
and high-ionisation BLR. Finally, Shen et al. (2008) found
that, in spectra with both a Mg ii and C iv line, there is
little-to-no correlation between the width of the two lines.
Shen et al. (2008) went on to conclude that, while virial
mass estimators based on the Mg ii and C iv lines can be
inconsistent in individual objects, results averaged over a
population are consistent.

C iv virial SMBH mass estimates have been shown
to be consistent with those from other lines as well as
for reverberation mapped objects (Vestergaard & Peterson
2006). In this paper we will generally assume that C iv
can be used as a virial mass estimator and derive
our results accordingly (see e.g. Vestergaard & Peterson
2006; Gavignaud et al. 2008 and Baskin & Laor 2005;
Netzer et al. 2007; Sulentic et al. 2007 for discussions for
and against this assumption). In section 6 we discuss
the implications of our own findings with respect to the
virial assumption and SMBH mass estimation.



3 DATA AND ANALYSIS

We take the same sample studied in Fine et al. (2008).
This comprises all of the quasar spectra from the Sloan
Digital Sky Survey (SDSS; York et al. 2000) data release
five (DR5; Adelman-McCarthy et al. 2007) (as compiled by
Schneider et al. 2007), the 2dF QSO Redshift survey (2QZ;
Croom et al. 2004) and 2dF SDSS LRG And QSO survey
(2SLAQ; Richards et al. 2005; Croom et al. 2009). Table 1
shows a brief summary of the number of objects and magnitude
limits in each sample. As one moves down the table
each successive survey has fewer spectra, but fainter flux
limits. Increasing our flux coverage allows for a more detailed
study of any luminosity effects on the C iv line width
distribution.



3.1 SDSS spectra

Details of the SDSS telescope and spectrograph are given in
Gunn et al. (2006) and Stoughton et al. (2002). The spectra
have a logarithmic wavelength scale translating to a dispersion
of ~1-2 Å/pix and a resolution λ/Δλ ~ 1800 in
the wavelength range 3800-9200 Å. Objects are observed
initially for 2700 sec, and are then reobserved in 900 sec blocks
until the median S/N is greater than ~4 pix^-1, resulting in
a S/N distribution with a mean at ~13 pix^-1.

The spectra are extracted and reduced with the spectro2d
pipeline and automatically classified with spectro1d
(Stoughton et al. 2002). However, in creating the
Sloan QSO sample used in this paper, Schneider et al.
(2007) visually inspect all of the candidate spectra to
determine their classification.



3.2 2dF spectra

Both 2QZ and 2SLAQ spectra were taken with the 2 degree
Field (2dF) instrument on the Anglo-Australian Telescope
with the 300B grating (Lewis et al. 2002). Spectra
have a dispersion of 4.3 Å/pix and a resolution of ~9 Å
in the wavelength range 3700-7900 Å. 2QZ exposure times
were between 3300 and 3600 sec compared with 14400 sec for
2SLAQ. The increase in exposure time for the fainter 2SLAQ
sample results in S/N distributions that are almost
indistinguishable (both peak at ~5.5 pix^-1 for positive QSO
IDs). 2dF spectra are extracted and manually classified with
the 2dFDR pipeline (Bailey & Glazebrook 1999) and autoz
redshifting code (Croom et al. 2001).

The main difference between reduced SDSS and 2dF
spectra is the lack of flux calibration for 2dF sources. An
average flux calibration for the 300B grating has been
calculated by Lewis et al. (2002) as part of the 2dF Galaxy
Redshift Survey; in our analysis we apply this correction to
the spectra.



3.3 Analysis

The size of our sample is such that we do not manually
inspect the C iv line in each spectrum. Instead we develop an
automated routine for measuring the C iv line width in our
sample. We do not include a long discussion here (details
are given in appendix A), but will state that we have developed
a line fitting routine that both measures the 50 %
inter-percentile velocity (IPV) width of the C iv line accurately
and returns an accurate error for that measurement. Our
procedure is found to be robust for spectral S/N > 3 Å^-1
(observed frame), and so in the analysis that follows this
S/N cut is applied to our data.



3.4 Final sample

To appear in our final sample a spectrum must: be at the
right redshift to have the C iv line, have S/N > 3 Å^-1, and
pass our BAL tests.

To have the C iv line in the spectrum, and enough surrounding
coverage for the continuum fit, requires a redshift
of >1.5 and >1.6 for 2dF and SDSS spectra respectively. At
high redshift we impose a further redshift limit of z < 3.3.
Beyond z ~ 3.3 the C iv line becomes mingled with the
strong sky emission lines at the red end of optical spectra.
While the sky subtraction in our sample is generally very
good, residual correlated features can produce spurious
results.

A significant proportion (> 10 %) of C iv lines in QSO
spectra are affected by strong broad-absorption features in
their blue wing (e.g. Trump et al. 2006). We develop an
automated routine for finding BALs in spectra. Details of
this routine are given in appendix B; here we will simply
state that overall it removes ~35 % of our spectra from the
final analysis.

All of these limits result in a final sample of 13,776
line measurements that we consider reliable and use in the
following analysis. The line width results for this final sample
can be found on the 2SLAQ website (www.2SLAQ.INFO).



4 RESULTS 

Before presenting the results of our line fitting we should 
comment on the redshift distribution of our sample. The 
2QZ, 2SLAQ and SDSS QSO samples are all selected based 
on their optical colours. At lower redshifts the UV excess
technique (e.g. Schmidt & Green 1983) or a variant is able
to distinguish QSOs as 'bluer' than stars. This becomes
ineffective for higher redshifts when the Lyman break enters the
U-band, reddening the colour of QSOs. At redshifts ~2.5-3
QSO optical colours become intermingled in the stellar locus
and target selection is very incomplete over this range
of redshifts (Richards et al. 2002). Beyond z ~ 2.5, dropout
techniques can be used to find high redshift targets due to 
their lack of blue-UV flux. 

Table 1. Summary of the surveys from which we obtained spectra. Successive surveys have fewer spectra but go deeper, increasing
our luminosity range at a given redshift. Note that the magnitude limits quoted for the SDSS QSO survey are those for the primary
QSO survey. The high redshift sample goes deeper, and included are sources observed under different selection criteria and also QSOs
identified as part of other surveys.

Survey        No. of Objects   Mag. Limits            Resolution   Dispersion    S/N
SDSS (DR5)    77,429           19 > i > 15            ~165 km/s    ~1.5 Å/pix    ~13/pix
2QZ           23,338           20.85 > b_J > 18.25    ~465 km/s    ~4.3 Å/pix    ~5.5/pix
2SLAQ         8,492            21.85 > g > 18.00      ~465 km/s    ~4.3 Å/pix    ~5.5/pix

Figure 1. The absolute magnitude-redshift distribution for all
objects with an accepted fit to the C iv line.



The result of these selection problems is a very uneven
redshift distribution for high-z QSOs. Fig. 1 shows the
magnitude-redshift distribution of objects with a C iv line
measurement. As can be seen, the vast majority of objects
are found in the main surveys at redshifts less than ~2.5
and there is a sparse number of high-z selected SDSS QSOs
up to z = 3.3. In addition, the irregular magnitude distribution
evident in Fig. 1 is caused by the differing magnitude
limits of the surveys we draw our data from.

In Fig. 1 we use r-band magnitudes since shorter wavelength
band passes will be affected by the Lyman break at
the redshifts we are sampling. We K-correct these using the
SDSS QSO composite of Vanden Berk et al. (2001). The
2QZ catalogue has data for the photographic r-band magnitude
of objects as opposed to SDSS r (Fukugita et al.
1996). However, the r-band photometry is incomplete as
2QZ QSO candidates could be selected without an r-band
detection. For consistency we use the b_J magnitudes from
the 2QZ catalogue. After K-correcting the b_J magnitudes
we use a constant colour correction calculated from the
Vanden Berk et al. (2001) template to transform to the
SDSS r-band. We note that there are very few 2QZ QSOs
beyond redshift three, which is roughly the point at which
the Lyman break enters the b_J band.

Figure 2. The results of our fitting of the C iv line. We plot the
absolute r-band magnitude of the source, M_r(z = 2.5), vs. the measured
IPV width of the C iv line. Over-plotted are lines of constant SMBH
mass (dotted) and Eddington ratio (dashed) as a guide to where
these objects fall in mass-accretion space. Masses are labelled in
units of M⊙.

Following previous authors (e.g. Richards et al. 2006),
rather than normalising the K-corrections to z = 0 we use a
redshift that is more representative of our data and removes
systematic errors arising from the large extrapolation to z =
0. We choose z = 2.5 as the zero-point of our K-corrections;
this amounts to a constant offset of M_r(z = 0) − M_r(z =
2.5) = 0.36. At z = 2.5 we correct our b_J magnitudes by
b_J − r = 0.02.
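
The two constant offsets quoted here can be applied as in the short sketch below. The function names are hypothetical, and the sign conventions follow the reading M_r(z = 0) − M_r(z = 2.5) = 0.36 and b_J − r = 0.02, which is an interpretation of the text rather than a documented procedure.

```python
K_ZERO_POINT_OFFSET = 0.36   # M_r(z=0) - M_r(z=2.5), as quoted in the text
BJ_MINUS_R = 0.02            # b_J - r colour applied at z = 2.5, as quoted


def mr_z25_from_mr_z0(abs_mag_r_z0):
    """Shift an absolute magnitude normalised at z = 0 to the z = 2.5 zero-point."""
    return abs_mag_r_z0 - K_ZERO_POINT_OFFSET


def r_from_bj(abs_mag_bj_z25):
    """Convert a K-corrected b_J absolute magnitude to the SDSS r band."""
    return abs_mag_bj_z25 - BJ_MINUS_R


print(mr_z25_from_mr_z0(-26.0), r_from_bj(-26.0))
```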



4.1 Luminosity vs. line width 

Fig. 2 shows our sample's distribution in absolute
magnitude-line width space. Contours on the plot are
equally spaced in terms of the log of the density of points.
As a rough guide to how these measurements would convert
to SMBH mass and accretion efficiency we have added to
the diagram lines of constant SMBH mass (dotted) and
Eddington ratio (dashed). These were calculated assuming the
Vestergaard & Peterson (2006) virial mass calibration.

The distribution appears to be constrained along lines
of constant SMBH mass and Eddington ratio such that the
line at M_BH = 10^10 M⊙ defines the top of the distribution.
The lower limit appears to be at Eddington ratios at or
around one, indicating very little super-Eddington accretion.

Fig. 2 can be directly compared with Fig. 5 in
Fine et al. (2008), which shows equivalent results for the
Mg ii line. The two plots are qualitatively very similar. There
is some indication of an offset between the two distributions
in SMBH mass-accretion efficiency space, with the C iv
distribution tending towards slightly larger Eddington ratios
and lower SMBH masses. However, one must keep in mind
that the normalisation of these lines is uncertain, potentially
by as much as 0.5 dex. The differing zero-points in the Mg ii
and C iv virial calibrations and/or systematics in the way
the lines in Fig. 2 and in Fine et al. (2008) were derived
could explain the difference between the figures.



4.2 Direct comparison with Mg ii line widths from Fine et al. (2008)

QSOs with redshifts between 1.5 and 2.3 will have both C iv
and Mg ii in their optical spectrum, and can be used to make
direct comparisons between the two lines. The requirement
to have a spectral window around the emission line for
continuum fitting means that the final sample of QSOs with
measurements for both the C iv and Mg ii lines is limited to
a redshift range 1.6 < z < 2.0. In this range we have 3197
spectra that have measurements of both the emission lines;
here and in Fine et al. (2008). Fig. 3 compares the velocity
width of the Mg ii and C iv lines for these objects.

Figure 3. The IPV width of the Mg ii line (taken from Fine et al.
2008) plotted against that of C iv for the 3197 objects with a
measurement of both features. The solid line shows the 1:1 relation
while the dashed line shows the expected ratio between the lines
assuming equation 5. The cross indicates the mean and rms of
the distribution.

Two aspects of Fig. 3 are of particular interest to this
work: (1) there does not appear to be an obvious correlation
between the widths of the two lines, and (2) there is a clear
offset such that the mean of the distribution does not lie on
the 1:1 line. We discuss each of these points below.



4.2.1 Reasons for offset

The ionisation potential of C iv is ~6 times that of Mg ii,
so we might expect it to be emitted from a region closer
to the continuum source where the ionisation parameter
is higher. Stratification of emission regions is a natural
part of some BLR models (e.g. Baldwin et al. 1995), and
is borne out observationally by reverberation mapping (e.g.
Onken & Peterson 2002).

If we take the oversimplified case in which the ionisation
parameter, U (U ∝ F/n_H, where F is the ionising flux and
n_H is the gas density), required to ionise the Mg ii and C iv
BLRs is proportional to their ionisation potential, χ, then

$$ \frac{U_{\rm C\,IV}}{U_{\rm Mg\,II}} = \frac{\chi_{\rm C\,IV}}{\chi_{\rm Mg\,II}} \qquad (3) $$

If we assume the flux follows an inverse square law and that
the gas density is approximately constant, and we assume
that the velocity field is dominated by virial motion (so that
v ∝ r^-1/2), then

$$ F \propto r^{-2} $$

which implies that

$$ \frac{U}{v^4} \simeq {\rm const.} \qquad (4) $$

and

$$ \frac{v_{\rm C\,IV}}{v_{\rm Mg\,II}} = \left( \frac{\chi_{\rm C\,IV}}{\chi_{\rm Mg\,II}} \right)^{1/4} \qquad (5) $$

While this model is clearly an oversimplification of the
BLR it gives the prediction that the ratio between the line
widths of Mg ii and C iv should be the fourth root of their
ratio in ionisation potential, or a factor of ~1.58. The dashed
line in Fig. 3 shows the v_CIV ~ 1.58 v_MgII relation. The
dashed line agrees remarkably well with the measured
zero-point of the distribution, and indicates why we may expect
an offset in the observed distribution. On the other hand,
the scatter around the zero-point demonstrates that there
is no simple way to relate the width of the Mg ii and C iv
lines.


4.2.2 The C iv-Mg ii correlation

The widths of Mg ii and Hβ have been shown to correlate
well for QSOs (McLure & Jarvis 2002; Salviander et al.
2007) and we might expect a similar correlation for C iv.
However, Fig. 3 does not display a clear correlation between
Mg ii and C iv line width (see also Shen et al. 2008).

If we perform a Spearman rank test on our data we find
a significant correlation (r_s = 0.35; P(r_s) ≪ 0.01). On the
other hand, we find 0.09 dex scatter around a y-on-x best
fit to the distribution, hardly reducing the 0.1 dex scatter in
the original data.
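
The two statistics used here, a Spearman rank coefficient and the rms scatter about a y-on-x least-squares line, can be sketched with the standard library alone. This is an illustrative implementation under simple assumptions (unweighted fit, average ranks for ties); it is not the code used for the paper, and the function names are hypothetical.

```python
import math
import statistics


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = 0.5 * (i + j) + 1.0
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rms_about_y_on_x(x, y):
    """rms of the residuals about an unweighted y-on-x least-squares line."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    resid = [b - (my + slope * (a - mx)) for a, b in zip(x, y)]
    return math.sqrt(statistics.fmean([r * r for r in resid]))
```

For an unweighted fit the residual rms equals the raw scatter in y multiplied by sqrt(1 − r²), where r is the Pearson coefficient. A coefficient near 0.35 therefore leaves about 94 % of the raw scatter, which is of the same order as the 0.09 dex versus 0.1 dex quoted above (the Spearman and Pearson coefficients need not coincide).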

If the average ratio between Mg ii and C iv line width
changes with luminosity or redshift, any correlation in Fig. 3
would be blurred by our inclusion of QSOs with a range of
these properties. As a test we bin our sample by luminosity
and redshift, and recalculate the Spearman rank coefficient
in each bin. The sample of objects that have both C iv and
Mg ii in their spectra span a relatively small redshift range,
and so we divide the sample into two redshift bins by the
approximate midpoint at z = 1.8. We then also divide the
sample into half-magnitude bins. Table 2 shows the number
of objects in each bin and the calculated r_s. It is clear that
r_s increases somewhat with luminosity, and stays roughly
constant with redshift. The zero-point to the relation stays
almost constant with respect to luminosity, varying by less
than 0.01 dex between the faintest and brightest bins.

It is likely that reduced measurement error is the cause
of the improving correlation (increasing r_s) with increasing
luminosity. In Table 3 we bin our sample by S/N rather than
luminosity or redshift. Table 3 shows that the correlation
improves for objects with higher S/N spectra. We also give
the mean percentage error on the C iv line width measurements
in each bin (C iv line widths invariably have larger
measurement errors; see section 5.3), and the rms scatter
around the y-on-x least-squares fit.

Table 3 shows that the correlation between Mg ii and

Table 2. These data look at the correlation between C iv and
Mg ii line widths when our sample is binned by luminosity and
redshift. We give the magnitude limits of each luminosity bin as
well as the number of objects in each bin and the Spearman rank
coefficient. These results are shown for objects in two redshift
bins as well as the whole sample.

                       z < 1.8        z > 1.8        All
M_r(z = 2.5) range     N     r_s      N     r_s      N     r_s
M > -26                475   0.24     152   0.27     627   0.24
-26 > M > -26.5        471   0.32     164   0.35     636   0.33
-26.5 > M > -27        614   0.34     351   0.32     966   0.33
-27 > M > -27.5        275   0.47     270   0.39     546   0.43
-27.5 > M > -28        122   0.49     143   0.30     265   0.39
-28 > M                46    0.39     73    0.44     119   0.41


Table 3. These data look at the correlation between C iv and
Mg ii line widths when our sample is binned by S/N. The first
column gives the S/N range of the bin. Also given is the number
of objects, the Spearman rank coefficient, the rms scatter around
a y-on-x least squares fit, and the mean percentage error on the
line width measurements in each bin.

S/N (Å^-1) range    N      r_s     rms around best fit (dex)    Mean error (%)
3 < S/N < 5         238    0.22    0.129                        21
5 < S/N < 8         564    0.25    0.104                        16
8 < S/N < 13        909    0.35    0.087                        10
13 < S/N < 20       876    0.42    0.073                        7
20 < S/N            522    0.42    0.082                        5


C iv line widths may be somewhat better than shown in
Fig. 3. Fig. 4 shows the relation between Mg ii and C iv
line widths for the 37 spectra in our sample that have both
lines and S/N > 40 Å^-1. The correlation in Fig. 4 is, perhaps,
more apparent than in Fig. 3. However, a Spearman
rank test implies the correlation is only marginally significant
(r_s = 0.36; P(r_s) = 0.03). The rms scatter around
a y-on-x regression line is 0.055 dex compared to 0.060 dex
raw scatter in C iv line widths. We also measure the intrinsic
scatter in the data, accounting for measurement error,
around a best-fit that minimises the 2D χ² and find it to be
0.055 dex.

To summarise, these results show that there is a significant
correlation between the widths of Mg ii and C iv lines
in our sample. However, the correlation is weak, in that the
dynamic range of line widths is only marginally broader than
the intrinsic scatter around a best-fit regression line.

There are many potential sources of intrinsic scatter in
Fig. 3. As we have seen, the higher ionisation potential of
C iv may indicate that its emission arises in a smaller part of
the BLR, closer to the SMBH. It has been suggested that the
emission regions for high ionisation lines may differ dynamically
from that of lower ionisation lines (Richards et al.
2002; Elvis 2004). There is, therefore, some concern as to
how well the C iv virial mass estimators would agree with
those for Hβ and Mg ii (see discussion in section 6).

Figure 4. The correlation between Mg ii and C iv line widths in
the highest S/N (> 40 Å^-1) spectra in our sample.



5 DISPERSION IN THE LINE WIDTH DISTRIBUTION



Motivated by results presented in Fine et al. (2008) we
examine the second moment of the C iv line width distribution;
in particular, we are interested in how it changes with
QSO luminosity. In brief, the analysis of Fine et al. (2008)
is performed in three steps:

• Bin our sample by redshift and luminosity 

• Calculate the 68.3 % interpercentile range of the line 
widths to characterise the dispersion in each bin. 

• Correct the dispersion with the median error on the
line widths in each bin by equation 7 in Fine et al. (2008)
to take account of scatter in the data due to measurement
error.
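
The three steps above can be sketched for a single bin as follows. Two points are assumptions made for this illustration: the 68.3 % interpercentile range is halved to give a 1σ-equivalent dispersion, and the error correction is a simple subtraction in quadrature, which may differ in detail from equation 7 of Fine et al. (2008). The synthetic data are invented for the demonstration.

```python
import math
import random
import statistics


def percentile(sorted_vals, q):
    """Linearly interpolated quantile of pre-sorted data, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def ipr_dispersion(values, coverage=0.683):
    """Half the central interpercentile range (1 sigma for a Gaussian)."""
    s = sorted(values)
    tail = 0.5 * (1.0 - coverage)
    return 0.5 * (percentile(s, 1.0 - tail) - percentile(s, tail))


def intrinsic_dispersion(log_widths, log_errors):
    """Measured dispersion with the median error removed in quadrature
    (assumed form of the correction)."""
    measured = ipr_dispersion(log_widths)
    median_err = statistics.median(log_errors)
    return math.sqrt(max(measured ** 2 - median_err ** 2, 0.0))


# Synthetic bin: 0.08 dex intrinsic scatter plus 0.05 dex measurement error.
rng = random.Random(1)
errors = [0.05] * 20000
widths = [3.6 + rng.gauss(0.0, 0.08) + rng.gauss(0.0, e) for e in errors]
recovered = intrinsic_dispersion(widths, errors)
print(f"recovered intrinsic dispersion ~ {recovered:.3f} dex")
```

In a real analysis the same function would be applied to each redshift-luminosity bin in turn.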

We find that the dispersion results for C iv are considerably
more sensitive to the methods used in their derivation
than was the case in Fine et al. (2008) for Mg ii. The
increased sensitivity to the fitting procedure is due to two
compounding issues. Firstly, since the process of measuring the
width of the C iv line is more complicated than for Mg ii, the
resulting errors on these widths tend to be larger. Secondly,
we find less intrinsic dispersion in C iv line widths than was
found for Mg ii. A narrower intrinsic line width distribution,
combined with larger measurement errors, makes deconvolving
their separate effects on the measured line width
distribution more difficult.



5.1 What affects the dispersion in the measured 
IPV width distribution? 

The measured IPV width of a line and its associated error 
depend on the spectral window in which the IPV width is 
calculated. We find that the way this window is defined can 
affect the results we derive for the dispersion in the IPV
width distribution.

When analysing the Mg ii line, Fine et al. (2008) defined
a region in the spectrum between ±1.5 times the
FWHM of a Gaussian fit to the line within which they
calculated the IPV width. We find that adopting a fixed region
between 1475 and 1625 Å, within which to calculate the IPV
width, gives more stable results for the C iv line. We choose
this region as being wide enough to enclose >99 % of the flux
for average C iv lines and >90 % for the broadest lines. We
do not make the limits wider as they would then encroach
on the region used for the continuum fit. Here we discuss
the effect of using a fixed region for our IPV calculations.
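
A 50 % IPV width measured in the fixed 1475-1625 Å window can be sketched as below. This is a minimal illustration that assumes a continuum-subtracted, non-negative line profile on a rest-frame wavelength grid; the function name, the trapezoidal integration and the linear interpolation are choices made for the example, not a description of the fitting routine in appendix A.

```python
import math

C_KMS = 299792.458      # speed of light [km/s]
LAMBDA_C_IV = 1549.0    # rest wavelength of the line [Å]


def ipv_width(wave, flux, fraction=0.5, window=(1475.0, 1625.0)):
    """Velocity width enclosing the central `fraction` of the line flux.

    For fraction = 0.5 this is the separation of the wavelengths at which
    the cumulative flux reaches 25 % and 75 % of its total in the window.
    """
    pts = [(w, f) for w, f in zip(wave, flux) if window[0] <= w <= window[1]]
    cum = [0.0]
    for (w0, f0), (w1, f1) in zip(pts, pts[1:]):
        cum.append(cum[-1] + 0.5 * (f0 + f1) * (w1 - w0))  # trapezoid rule

    def crossing(q):
        target = q * cum[-1]
        for i in range(1, len(cum)):
            if cum[i] >= target:
                t = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
                return pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0])
        return pts[-1][0]

    lo = crossing(0.5 - 0.5 * fraction)
    hi = crossing(0.5 + 0.5 * fraction)
    return C_KMS * (hi - lo) / LAMBDA_C_IV


# Illustrative Gaussian line with sigma = 10 Å (not real data).
wave = [1475.0 + 0.5 * i for i in range(301)]
flux = [math.exp(-0.5 * ((w - 1549.0) / 10.0) ** 2) for w in wave]
width = ipv_width(wave, flux)
print(f"50% IPV width ~ {width:.0f} km/s")
```

For a Gaussian profile the 50 % IPV width is about 1.349σ, or roughly 0.57 of the FWHM, which gives a quick sanity check on any implementation.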

In Fig. 5 we show the dispersion in IPV width as a
function of S/N. We do not distinguish between 2dF and
SDSS spectra since we find no difference between them when
compared at the same S/N. Fig. 5(a) shows the behaviour
when the IPV width is measured over a spectral region
defined by the Gaussian fit to the C iv line, while the widths
in (b) are calculated in a fixed spectral window. In each
plot the measured dispersion in IPV widths is shown as the
open squares; this is corrected by the median error (shown
as circles) to give an estimate of the intrinsic dispersion in
C iv line widths (shown as filled squares with error bars).
The dashed line shows S/N = 3 Å^-1, below which we know
our line width measurements are biased from simulating low
S/N spectra.

We find that employing a fixed region for the IPV width 
calculation results in the derived intrinsic dispersion in IPV 
widths being less dependent on the spectral S/N. The disper- 
sion in IPV widths calculated with a variable region tends to 
be greater at lower S/N and their errors tend to be smaller 
than those calculated using a fixed spectral window. 

In general, the fixed window will be larger than a win- 
dow defined by the Gaussian fit, and one expects the errors 
on the IPV widths calculated over a fixed window to be 
larger. It is less clear why we find less scatter in IPV widths 
calculated with a fixed spectral window when compared with 
those calculated over a variable region. The indication is that 
using our Gaussian fits is adding uncertainty to our IPV cal- 
culations in some manner that is not reflected in their errors. 
The level of uncertainty is very low (<0.01 dex for S/N ~ 4).
It is possible that the non-linear multi-Gaussian fits to the
spectra are unreliable at this level.

To have confidence in our analysis we need to identify 
what causes the increase in the corrected dispersion in IPV 
widths calculated over a variable spectral window as we go 
to lower S/N. Is this due to increased noise? Alternatively, 
because of the correlation between the intrinsic luminosity 
of a source and the spectral S/N, is this due to an inherent 
property of the quasars in our sample? That is, is this
increase due to a correlation between the intrinsic dispersion
in C iv line widths and quasar luminosity, as was evident for
Mg ii (see Fine et al. 2008)?

The S/N and absolute magnitude of the QSOs in our 
sample do correlate (Fig. 6). To test what is causing the
increase in the corrected dispersion in IPV widths calculated 
over a variable spectral window as we move to lower S/N we 
add noise to high S/N spectra, calculate the IPV width in 
this noisier spectrum and then replot the dispersion in IPV 
widths as a function of S/N. 

Figure 6. The relation between the spectral S/N calculated in
the local C iv region of the spectrum and the absolute magnitude
of quasars in our sample.

We degrade every spectrum in our data such that its
resulting S/N is 1/3 of its original value. In doing so we also
modify the error on each pixel to take account of this
addition of noise. Fig. 7 shows the new dispersion results as a
function of S/N. When we add noise to the spectra
artificially we find an almost identical relation between
dispersion in IPV width and S/N as was evident in Fig. 5.
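The degradation step can be sketched as follows. This is a minimal illustration under the assumption of independent Gaussian noise on each pixel, not a transcription of our pipeline; the function name `degrade_snr` is hypothetical. To lower the S/N by a factor f, noise with standard deviation sqrt(f^2 - 1) times the original pixel error is added, so that the total error per pixel becomes f times the original.

```python
import math
import random


def degrade_snr(flux, err, factor=3.0, rng=None):
    """Return (new_flux, new_err) with S/N reduced by `factor`.

    Assumes independent Gaussian errors on each pixel. Adding noise of
    variance (factor^2 - 1) * err^2 to the existing variance err^2
    gives a total variance of factor^2 * err^2.
    """
    rng = rng or random.Random()
    extra = math.sqrt(factor ** 2 - 1.0)
    new_flux = [f + rng.gauss(0.0, extra * e) for f, e in zip(flux, err)]
    new_err = [factor * e for e in err]
    return new_flux, new_err


# Example: a flat spectrum with unit errors, degraded by a factor of 3.
rng = random.Random(0)
flux = [10.0] * 20000
err = [1.0] * 20000
noisy_flux, noisy_err = degrade_snr(flux, err, 3.0, rng)
```

The line width is then remeasured on `noisy_flux` with `noisy_err` as the pixel errors, exactly as for an undegraded spectrum.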

The similarity between Figs. 5 and 7 indicates that the
increase in corrected dispersion observed towards lower S/N
is due to noise and not an intrinsic property of the QSO
emission lines. To further illustrate the effect we average the
dispersion calculated with our original data for S/N>9 and
plot this in Fig. 7 as the dotted line. For S/N>9 in Fig. 5
the dispersion in IPV width is relatively constant with S/N.
A S/N of 9 in Fig. 5 corresponds to S/N=3 in Fig. 7. Clearly
the points at S/N>3 follow the dotted line in Fig. 7(b) more
closely than in (a).

Fig. 8 compares the IPV line widths measured over a
fixed spectral window with those measured over a window
defined by the Gaussian fit. In Fig. 8(a) we plot a direct
comparison between the measured widths and in (b) the ratio
is plotted. There is scatter between the values, and this
increases for narrower line widths where the difference
between the IPV windows will be greatest. For narrower
lines the IPV widths measured over a fixed range tend to be
larger, although there is scatter in both directions. Overall
it does not seem that we are biasing our results significantly
by fixing the region over which we calculate the IPV width.

Since the relation between the corrected dispersion in 
IPV widths and S/N is flatter when using a fixed spectral
window to calculate the IPV widths, we prefer this method 
for our final fitting procedure. There remains a slight trend 
in the corrected dispersion with S/N such that the corrected 
dispersion increases by ~ 0.015 dex towards S/N=3. We find 
that the trend can be altered depending on the statistics 
we employ to define the dispersion in IPV widths and the 
average error used in the correction. 

In all of the above figures we have followed the method
used in Fine et al. (2008). We have taken the 68.3 %
inter-quartile range to parametrise the dispersion in the IPV
width distribution, and used the median error to correct
to the intrinsic dispersion.

Figure 5. The dispersion in C iv IPV width as a function of spectral S/N. In (a) the IPV width is calculated over a spectral window
defined by the Gaussian fit to the line. In (b) the window has a fixed range. The raw dispersion in IPV widths is plotted as open squares;
this is corrected by the median error (filled circles) to estimate the intrinsic dispersion in C iv line widths (filled squares). The dashed
line shows S/N=3 below which we do not have confidence in our fitting. Above S/N=3 the fits in (a) show slightly more dependence on
S/N than in (b).

Figure 7. The dispersion in IPV line widths measured from spectra which have had their S/N degraded by a factor of three. In (a) the
IPV width is calculated over a spectral window defined by the Gaussian fit to the line. In (b) the window has a fixed range. Symbols are
as in Fig. 5; the dashed line shows the S/N=3 line below which we do not trust our fitting results. The dotted line shows the dispersion
in IPV widths for objects with original spectra with S/N>9, and should line up with degraded spectra with S/N>3.

Taking the rms of the IPV width distribution and
correcting by the mean error we find the results shown in Fig. 9.
Here the trend is less than in Fig. 5(b), although the effect is
small for S/N > 3. Nonetheless, in the analysis that follows
we employ the rms to calculate dispersion and correct by
the mean error of the IPV widths.
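The correction can be sketched in a few lines. We assume here that the mean measurement error is subtracted from the raw rms in quadrature, which is the usual way of removing a noise contribution from an observed scatter; this assumption, the function name and the toy numbers below are for illustration only.

```python
import math


def intrinsic_dispersion(log_widths, log_errors):
    """Return (raw_rms, mean_error, intrinsic) in dex.

    log_widths : log10 of the IPV widths in one bin.
    log_errors : 1-sigma errors on those values, also in dex.
    The intrinsic dispersion is sqrt(rms^2 - mean_error^2), set to
    zero if the errors exceed the observed scatter.
    """
    n = len(log_widths)
    mean = sum(log_widths) / n
    rms = math.sqrt(sum((x - mean) ** 2 for x in log_widths) / n)
    mean_err = sum(log_errors) / n
    var = rms ** 2 - mean_err ** 2
    return rms, mean_err, math.sqrt(var) if var > 0.0 else 0.0


# Toy example: 0.10 dex raw scatter and 0.06 dex errors.
raw, err, intrinsic = intrinsic_dispersion([3.5, 3.7], [0.06, 0.06])
```

In this toy case a raw scatter of 0.10 dex with 0.06 dex errors implies an intrinsic dispersion of 0.08 dex.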



5.2 Dispersion in IPV widths vs. luminosity and 
redshift 

To characterise the dependence of the dispersion in C iv line
width on quasar luminosity and redshift we bin our data
by L and z and calculate the dispersion in each bin. Due
to the uneven redshift distribution of our sample (Fig. 1)
we do not choose evenly spaced redshift bins for our
analysis. Fig. 10 shows the cumulative redshift distribution of
our sample over which we have marked the limits of the
redshift bins we will use. The three lowest redshift bins have
roughly equal width and equal numbers of objects. The
redshift bin centred at z = 2.5 contains QSOs approximately
during the quasar epoch. Finally we have a high redshift bin
of objects with z > 3, potentially before the quasar epoch
(Richards et al. 2006).

Figure 8. Comparisons between IPV widths calculated over a fixed spectral window, and with a window which is defined by the Gaussian
fit. (a) shows the direct comparison and (b) shows their ratio as a function of line width.

Figure 9. Here we show the measured dispersion in C iv line
widths as a function of S/N. Open squares show the raw
dispersion measured as the rms of the distribution. These are corrected
by the mean error on the line widths (circles) to estimate the
intrinsic dispersion in line widths (filled squares).

Figure 10. The cumulative redshift distribution of objects in our
sample, i.e. the y axis plots the number of objects with redshift
less than z. Vertical lines on the plot show the limits of the redshift
bins we will be using in our analysis.

Figure 11. The dispersion in C iv line width in magnitude-redshift
bins as a function of magnitude. Each redshift bin is
plotted with a different colour and the midpoint of these bins is
indicated on the top right.

In each redshift bin we calculate the dispersion in C iv
IPV line width as a function of luminosity and plot the
results in Fig. 11. We find no clear trend between QSO
luminosity and the dispersion in C iv IPV width. The lowest four
redshift bins appear to be equivalent, but there is an
indication that the highest redshift bin is offset to a higher
dispersion. Furthermore, there is a suggestion that the highest
redshift bin shows an inverse correlation between the
dispersion in C iv line width and luminosity, although the dynamic
range is small.

The evidence for increased dispersion at high redshift 
is inconclusive; if the result is real it may be indicating that 
there is more scatter in SMBH mass/Eddington ratio. At 
such high redshifts the SMBH mass function can only be 
steeper and we would expect to find many fewer SMBHs 
with Mbh > 10^" . It is hard to imagine a population of ob- 
jects capable of broadening the active SMBH mass distribu- 
tion towards higher masses at such an epoch. Alternatively, 
there could be more super-Eddington accretion at high red- 
shift. But again, it is difficult to imagine a reason why the 
Eddington limit would be a weak constraint at z > 2.6 and 
then become a stronger limit at lower redshift. As an
alternative (non-virial) explanation, the C iv line width could,
potentially, be related to outflows from QSOs. The larger
dispersion in C iv line widths at high redshift may be
indicating that outflows were more common/more varied in the
high redshift Universe.



5.3 Comparing the dispersion in C iv and Mg ii
line widths



Fine et al. (2008) showed that the dispersion in Mg ii line
width depends on QSO luminosity, but Fig. 11 shows no
clear signs of such a dependence for C iv. In Fig. 12 we
take the Mg ii line widths from Fine et al. (2008) and make
a direct comparison between the dispersion results for C iv
and Mg ii.

Figure 12. Comparison between the dispersion in Mg ii (squares)
and C iv (triangles) line widths. Each is plotted against the
absolute r-band magnitude K-corrected to z = 2.5. Open symbols
show the raw dispersion in the data calculated as the rms of the
line width distribution. Solid symbols give the implied intrinsic
dispersions once we have corrected by the mean of the errors on
the IPV width measurements.

In Fig. 12 we calculate the intrinsic dispersion in the
line width distributions in the same way for both C iv and
Mg ii. That is, we calculate the rms of the IPV width
distribution (open symbols) and correct this by the mean of
the errors on the IPV width in each bin to give the
intrinsic dispersion in line width (shown as solid symbols). There
is a clear offset between the Mg ii and C iv data. The
uncorrected dispersion in IPV width is larger for Mg ii at all
but the brightest luminosities. In addition, the errors on the
Mg ii line widths are smaller.

The width of the Mg ii line can be measured with more
precision than C iv for two reasons. Firstly, in the case of
C iv we fit a linear continuum between two 45 Å wide
windows either side of the C iv line. We are forced into using
small windows due to emission from other (sometimes
unknown) ions in the C iv line region. In the case of Mg ii a
much wider (> 450 Å) spectral window can be used in the
continuum fit. This results in a considerably more precise
model for the local emission.

In addition to a more precise continuum fit, Mg ii does
not have the local contaminating emission (apart from iron
emission that is well fit by a single template) which makes the
C iv fitting more difficult. The He ii and O iii] emission lines
on the red wing of C iv add further uncertainty into the line
width calculations.

The differing behaviour between C iv and Mg ii in
Fig. 12, combined with the lack of a strong correlation
between these lines (Fig. 3), lends further weight to the
argument that the emission regions for these two lines are
distinct.



6 VIRIAL SMBH MASS ESTIMATION WITH Mg ii AND C iv

Above we have compared C iv and Mg ii line widths
measured in spectra which have both lines, and find considerable
scatter in the comparison. We find that the dispersion in C iv
line width is smaller than that for Mg ii and does not show
the same trend with luminosity that Mg ii exhibits.

Overall, the differing behaviour of the Mg ii and C iv
lines is a concern when employing these as virial SMBH mass
indicators. The lack of a clear correlation between the widths
of the two lines may be of most concern since, for a virialised
BLR, these should correlate well. However, even without a
correlation between Mg ii and C iv line widths, virial SMBH
estimations using the two lines are still consistent.

The consistency occurs because the dynamic range in
line widths for both C iv and Mg ii is less than ~0.15 dex,
or ~0.3 dex in SMBH mass (under the virial assumption,
in which the mass scales as the square of the line width).
The quoted uncertainty on virial SMBH mass estimates is
typically larger than ~0.3 dex (e.g. 0.32 dex for the C iv
calibration from Vestergaard & Peterson 2006, and 0.33 dex for
Mg ii from McLure & Dunlop 2004). Hence the line width
term in virial mass estimators has only a weak effect on
estimates for SMBH masses: the luminosity term is
responsible for the dynamic range of virial SMBH mass estimates.
Therefore, virial SMBH mass estimates from different lines
correlate, even if the widths of the emission lines themselves
show no correlation. If virial mass calibrations cannot
estimate SMBH masses to a higher degree of accuracy than
the dynamic range in line width, the usefulness of the line
width is unclear. Instead, virial estimators may only appear
to work due to their luminosity dependence.
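This argument can be illustrated with a toy simulation. The sketch below is not fitted to our data: it assumes a generic virial estimator of the form log M = const + b log L + 2 log(line width), with an illustrative slope b = 0.5, a luminosity range of 2 dex and completely uncorrelated line widths with 0.1 dex scatter for the two lines. All of these values and the function names are assumptions chosen for the example, not the published calibrations.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, width_scatter=0.1, lum_range=2.0, lum_slope=0.5, seed=1):
    """Return (r_widths, r_masses) for two independent sets of line widths."""
    rng = random.Random(seed)
    log_l = [rng.uniform(0.0, lum_range) for _ in range(n)]
    # Line widths (dex) drawn independently for the two lines.
    log_w1 = [rng.gauss(0.0, width_scatter) for _ in range(n)]
    log_w2 = [rng.gauss(0.0, width_scatter) for _ in range(n)]
    # Virial form: mass scales as L^b times width squared.
    log_m1 = [lum_slope * l + 2.0 * w for l, w in zip(log_l, log_w1)]
    log_m2 = [lum_slope * l + 2.0 * w for l, w in zip(log_l, log_w2)]
    return pearson(log_w1, log_w2), pearson(log_m1, log_m2)


r_widths, r_masses = simulate()
```

Under these assumptions the two sets of line widths are uncorrelated by construction, yet the two sets of mass estimates correlate clearly because they share the same luminosity term.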

While many of our results imply there are problems 
with the virial technique for estimating SMBH masses, we 
also find evidence supporting the virial assumption. We find 
that the offset between the average C iv and Mg ii line width
is consistent with a simplistic model which assumes a pho- 
toionised BLR with virial velocities; potentially this indi- 
cates that both the high and low ionisation BLR are viri- 
alised and so could be used to estimate SMBH masses. How- 
ever, scatter around this mean relation shows that this sim- 
ple interpretation alone is inadequate. 

It is unclear how all of our results can be accommodated 
in a single model for the BLR, and what the eventual impact 
will be on the virial technique. However, it is clear that care 
needs to be taken when using virial SMBH mass estimates, 
and a better understanding of the BLR is necessary before 
virial mass estimates can be considered to be unbiased. 



7 THE GEOMETRY OF THE C iv BLR 

If the C IV BLR velocity field is in any way asymmetric then 
the width of the C IV line will depend on the viewing angle. 
We find only a very small level of dispersion in Civ line 
widths at all luminosities, and use this to constrain models 
for the velocity field of the BLR. 

Perhaps the most common toy model for the BLR
consists of a component confined to a disk (either rotating
or as a wind) as well as a random isotropic component.
Various parametrisations for this model can be found in
the literature (e.g. Jarvis & McLure 2006; Collin et al. 2006;
Labita et al. 2006; Fine et al. 2008). In the following
discussion we will use

$$ \mathrm{line\ width} \propto \sqrt{v_d^2 \sin^2(\theta)/2 + v_r^2/3}, \qquad (6) $$

where v_d and v_r are the disk and random velocities
respectively. This model assumes that the disk and random
components have approximately Gaussian velocity profiles, and
that the emitting regions are not distinct. The factors of 2
and 3 occur due to the dimensional confinement of the two
components.

Figure 13. This shows how a, as defined by equation (8), is
limited by our data as a function of the opening angle of QSOs (solid
line). The parameter space above and to the right of the line is
inconsistent with our results. For comparison with Fine et al. (2008)
we also constrain a by their parametrisation for the BLR (dashed
line).

We are not interested in absolute line width
measurements, only the effect of orientation. We can therefore
transform equation (6) into a function of a single variable that
describes how disk-like (or not) the BLR is. To this end
we must tie v_d and v_r together under some assumption.
We assume that, viewed edge on (i.e. at θ = 90°), the
disk and random components are indistinguishable. That
is, v_d^2/2 + v_r^2/3 = const. We then define a by

$$ a^2 = \frac{v_d^2/2}{v_d^2/2 + v_r^2/3}. \qquad (7) $$

Hence if a = 1 the BLR is disk-like and if a = 0 it is
spherically symmetric. We then have

$$ \mathrm{line\ width} \propto \sqrt{a^2 \sin^2(\theta) + (1 - a^2)} \qquad (8) $$

as our parametrisation for the BLR.

Given our simple model we have defined the effect of
orientation on line width and, assuming an opening angle
to QSOs defined by the extent of their molecular torus (see
Fig. 13 in Fine et al. 2008), we can calculate the dispersion
in line width due to orientation effects. Fig. 12 shows that
there is ~0.08 dex scatter in C iv line width. We use this
value to calculate an upper limit for a depending on the
assumed opening angle to QSOs. That is, the maximum value
a can take to be consistent with our data.

Fig. 13 shows how the limit on a depends on the
assumed opening angle to QSOs (solid line). To facilitate
comparisons with Fig. 15 in Fine et al. (2008) we have also
plotted the same constraint assuming the parametrisation used
in that paper (dashed line). In both cases very disk-like
BLRs are ruled out at all assumed opening angles. This is
an extension of the argument that, if the BLR is a confined
disk, we should find more narrow-line QSOs. Since we do
not, the BLR must have a significant non-disk component
(Osterbrock 1977).

We can perform the same tests assuming a velocity field
that is constrained in the polar direction, potentially a BLR
which is part of a wind. In this case our parametrisation
for the BLR becomes:

$$ \mathrm{line\ width} \propto \sqrt{a^2 \cos^2(\theta) + (1 - a^2)}. \qquad (9) $$

Figure 14. This shows how a, as defined by equation (9) (solid line),
is limited by our data as a function of opening angle. Again we also
show the constraint following Fine et al. (2008) for comparison
(dashed line).

Fig. 14 plots the results for this model. Here, even with very
constrained BLRs (i.e. a ~ 1), we can only rule out the
model if the opening angle to QSOs is large (> 60°).
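The orientation-induced scatter implied by equations (8) and (9) can be estimated numerically. The sketch below is an illustration rather than the calculation behind Figs 13 and 14: it assumes that QSOs are viewed at inclinations between 0° (face on) and the opening angle, with orientations random on the sphere (uniform in cos θ), and it takes the scatter to be the rms of log10(line width). The function name and these assumptions are ours for the purpose of the example.

```python
import math


def orientation_scatter(a, opening_angle_deg, polar=False, n=2000):
    """rms scatter (dex) in log10(line width) due to orientation alone.

    a                 : disk-like (or polar) fraction parameter, 0 <= a <= 1.
    opening_angle_deg : largest inclination at which a QSO is seen (assumed).
    polar             : False uses equation (8), True uses equation (9).
    """
    cos_min = math.cos(math.radians(opening_angle_deg))
    logs = []
    for i in range(n):
        # Midpoints evenly spaced in cos(theta): random orientations.
        c = cos_min + (i + 0.5) * (1.0 - cos_min) / n
        proj = c * c if polar else 1.0 - c * c
        logs.append(0.5 * math.log10(a * a * proj + (1.0 - a * a)))
    mean = sum(logs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in logs) / n)


disk_scatter = orientation_scatter(0.9, 45.0)
polar_scatter = orientation_scatter(0.9, 45.0, polar=True)
```

A value of a is then disfavoured for a given opening angle if the returned scatter exceeds the ~0.08 dex observed. Under these assumptions the polar model produces less scatter than the disk model at the same a, in line with the weaker constraint found for equation (9).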



8 CONCLUSIONS 

We have performed an extensive analysis of the Civ line 
width distribution in QSO spectra from the 2SLAQ, 2QZ 
and SDSS. We reviewed the three most common methods 
for fitting C iv emission in QSO spectra and performed a 
detailed comparison between these, using both composite 
spectra and individual fits to our whole dataset. Based on 
these results we have chosen the procedure that we believe 
is the most physically motivated and least biased to employ 
for our analysis. Furthermore we have developed a procedure 
for identifying absorption features in spectra and removing 
BAL systems from the dataset. 

Applying our routine to spectra from the SDSS, 2QZ
and 2SLAQ surveys we have measured the C iv line width
distribution for QSOs in the redshift range 1.5 < z < 3.3
spanning a magnitude range of -24 > M_r(z = 2.5) > -29.
We find that the line width vs. luminosity plot (Fig. 2) shows
many similarities with the equivalent plot for Mg ii.
However, this appears to be where the similarity ends.

We compare the C iv and Mg ii line widths calculated
in spectra that have both lines, and find considerable scatter
in the comparison. We find that the dispersion in C iv line
width is smaller than that for Mg ii and does not show the
same trend with luminosity that Mg ii exhibits.

These results are discussed in terms of virial SMBH
mass estimation. We show that, for both Mg ii and C iv
estimators, the line width term contributes considerably less
dynamic range to the resulting mass estimate than the quoted
error on the estimate. The dynamic range found in virial
SMBH mass estimates comes (almost entirely) from their
luminosity term, and this is solely responsible for the
consistency of virial masses based on the two lines since the line
widths do not correlate.

Finally the results are discussed in terms of BLR
dynamics. We show that, given the small scatter in C iv line
widths, the C iv BLR cannot be a flat disk. We parametrise
models for hybrid BLRs which include a random/isotropic
velocity component and show how our results can be used
to constrain these models.

APPENDIX A: FITTING THE C iv LINE

QSO emission in the vicinity of the C iv line is complex
and there is no simple, accepted prescription for fitting
its profile. Complex iron emission is likely to pervade the
C iv line region. In addition, the red wing of the line is
contaminated by the weak He ii λ1640 and O iii] λ1663
lines (the O iii] line may also be blended with Al ii λ1671,
but we shall just refer to this blend as O iii]), as well as
a significant contribution from an unidentified source at
~1600 Å (e.g. Wilkes 1984; Boyle 1990; Laor et al. 1993;
Vanden Berk et al. 2001). Much of the difficulty in fitting
the C iv line is due to this unidentified emission in the red
wing of the line.

Furthermore, high S/N spectra have shown evidence for
N iv] λ1486 and Si ii λ1531 at low levels in the blue wing of
the line (e.g. Cristiani & Vio 1990; Boyle 1990; Laor et al.
1994; Vestergaard & Wilkes 2001).

Most of these contaminating lines are weak compared
with C iv and can be corrected for. The primary difficulty
when fitting the C iv region of quasar spectra is the excess
emission at ~1600 Å. Conflicting fitting prescriptions that
correct for this feature lead to systematic biases when
parametrising the C iv line.



A1 The ~1600 Å feature

Fig. A1 shows the C iv region from the 2QZ QSO
composite (Croom et al. 2002). Obvious emission features due to
C iv, He ii and O iii] are labelled. The composite shows no
evidence for N iv] or Si ii in the blue wing. However, there
is clearly emission around 1600 Å that cannot easily be
described as a combination of the labelled lines.

Wills et al. (1980) suggested there should be significant
Fe ii emission in the ~1610-1680 Å range based on models
for Fe ii emission due to collisional excitation. More recent
models for quasar iron emission also show significant flux in
this region (Sigut & Pradhan 2003). However, the models
do not contain enough flux at 1600-1610 Å to accurately
fit observed quasar spectra.




Figure A1. The C iv region of the 2QZ QSO composite
spectrum (heavy line) with known emission features labelled. In
constructing this composite each spectrum is normalised to a fitted
continuum. The normalised continuum is shown as a dashed line
to highlight excess emission. The fine line at the bottom of the
figure shows the Vestergaard & Wilkes (2001) template for iron
emission. We have scaled the iron template by an arbitrary constant
for ease of comparison in the figure.



The empirical iron template of Vestergaard & Wilkes
(2001) shows emission between 1590 and 1620 Å, but again
not at the levels required to accurately fit observed spectra.
Fig. A1 shows the Vestergaard & Wilkes (2001) iron
template. Comparing the template with the composite spectrum
at 1450 or 1720 Å and at 1600 Å shows why the excess emission
at 1600 Å cannot be accurately described with this template.
The lack of flux in the vicinity of He ii and O iii]
in the Vestergaard & Wilkes iron template is a result of
their removal of Gaussian fits to the He ii and O iii] lines.
It may be that in doing so the iron template has been
over-corrected and there may, in fact, be residual iron emission
in these regions.



A2 Techniques for fitting the C iv line 

Due to uncertainty in identifying the source of the emission
around C iv, there is no standard prescription for analysing
the C iv line, and previous studies have implemented various
techniques for fitting the C iv region. These approaches can
all be classified as three different ways of dealing with the
emission at 1600 Å.

1) Assume the emission at 1600 Å is a red wing of C iv.

2) Assume the emission at 1600 Å is due to another species
and try to correct for it.

3) Try to fit around the 1600 Å feature without attempting
to explain it.

We examine each of these line fitting procedures in turn.



A2.1 The 1600 Å feature as a red wing to the C iv line

One of the more common fitting procedures treats the
excess emission at 1600 Å as an extended red wing of the C iv
line (Laor et al. 1994; Fine et al. 2006; Shen et al. 2008). A
continuum is fitted to the local region (~1450-1700 Å) with
or without an iron template included. Then single Gaussians
are fitted to contaminating features (He ii, O iii], N iv] etc.)
along with three Gaussians to describe the C iv line. Fig. A2
illustrates this procedure applied to the 2QZ QSO
composite.

Three Gaussians are needed to fit the C iv line since the
residual after the continuum and other lines have been
subtracted is markedly asymmetric. Fig. A2(b) shows
that the resulting fit to the C iv region is able to accurately
model the spectrum. The flaw in the prescription is that it
assigns a significant amount of C iv emission to implausible
velocity shifts from the line centre (Laor et al. 1994), and
that it imposes a strong asymmetry on the C iv line which
is not obvious in the original spectrum.




Figure A4. The C iv region from the spectrum of QSO
J222203.6-320437 taken during the 2QZ survey. This spectrum
shows He ii and O iii] features, but no other significant emission
redwards of ~1600 Å.



A2.2 The 1600 Å feature as He ii emission

Croom et al. (2002) fit the emission redwards of C iv with
a single broad Gaussian centred at 1640 Å associated with
He ii emission. This effectively removes the strong emission
feature at 1600-1700 Å; however, it does not accurately
remove residual features in the region. Shang et al. (2007) go
a step further and fit for narrow O iii] and He ii emission on
top of broad He ii emission. These additional components
allow one to accurately reproduce the shape of quasar spectra
in the C iv region; an example fit is given in Fig. A3.

Fig. A3(a) shows the continuum subtracted 2QZ
composite spectrum along with dashed lines showing the two
Gaussians fitted to He ii and the one to O iii]. Fig. A3(b)
shows the residual spectrum once these components have
been subtracted. Comparing Fig. A3(b) with Fig. A2(b), we
find the strong asymmetry that results from assuming the
1600 Å feature is C iv emission is not evident when
applying this prescription for fitting. Since the resulting C iv line
is almost symmetric it can be relatively well modelled by a
sum of two Gaussians as in Fig. A3(b).

There are two main strengths to this second fitting pre- 
scription. Firstly, each separate component of the fit is as- 
signed to a particular ion; hence it can be argued that the 
prescription makes sense physically. Secondly, the fit pre- 
serves the symmetry of the C iv line. While we do not know 
that the C iv line is intrinsically symmetric, it is difficult
to imagine that emission extending to > 20,000 km/s is
associated with the C iv line. Furthermore, many individual high
S/N spectra show strong C iv emission but very little
emission in the red wing (e.g. Fig. A4).

We must remember that just because we have fit the
1600 Å feature with a broad He ii component does not
necessarily mean that He ii is responsible for the emission. At
the point where the broad He ii component becomes blended
with the red wing of C iv it is not clear that extrapolating
the Gaussian fits will remove contaminating emission from
the C iv line correctly. Hence the main concern with this
fitting procedure derives from its strength: because we are
assigning the 1600 Å emission to He ii the fit appears to be
physical, and this clouds the fact that we may be introducing
an unknown systematic into the results.



A2.3 Fitting around the 1600 Å feature

A third option for fitting the C iv line is to accept that it is
unclear how to correct for the emission redwards of C iv and
to perform the simplest possible non-parametric correction.
Wilhite et al. (2008) fit a linear continuum
between small windows centred at 1480 and 1690 Å, and then
calculate central moments over the interval 1496-1596 Å
to describe the C iv line.

This approach has the advantage of simplicity and does
not rely on imposing a specific profile (Gaussians in the
examples discussed above) on spectral features. Since no
attempt is made to correct for the excess flux at 1600 Å, this
prescription will likely have larger systematic errors than
fitting for He ii emission in this region. However, the
systematics are also likely to be more consistent when compared to
other fitting procedures since there are many fewer fitted
parameters.

As a small adjustment to Wilhite et al. (2008) we also
consider a prescription in which the continuum is fitted
between 1450 and 1610 Å, and parameters for C iv are then
calculated in this region, as shown in Fig. A5. The limits are
chosen as the local minima either side of the C iv line, and can
be thought of as the regions where C iv emission ceases to
dominate the spectral shape.
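This non-parametric prescription is simple enough to sketch directly. The code below is an illustration under stated assumptions, not our fitting code: the continuum is a straight line through the mean flux in two small windows (the 5 Å half-width is a hypothetical choice), and the line is described by its flux-weighted centroid and second central moment on a uniform wavelength grid. The function names are ours.

```python
import math


def subtract_linear_continuum(wave, flux, blue=1450.0, red=1610.0, half_width=5.0):
    """Subtract a straight line joining the mean flux in two windows."""
    def window_mean(centre):
        pts = [(w, f) for w, f in zip(wave, flux) if abs(w - centre) <= half_width]
        return (sum(w for w, _ in pts) / len(pts),
                sum(f for _, f in pts) / len(pts))

    (w0, f0), (w1, f1) = window_mean(blue), window_mean(red)
    slope = (f1 - f0) / (w1 - w0)
    return [f - (f0 + slope * (w - w0)) for w, f in zip(wave, flux)]


def line_moments(wave, flux, lo, hi):
    """Flux-weighted centroid and dispersion (Angstrom) over [lo, hi]."""
    pts = [(w, f) for w, f in zip(wave, flux) if lo <= w <= hi]
    total = sum(f for _, f in pts)
    centroid = sum(w * f for w, f in pts) / total
    variance = sum((w - centroid) ** 2 * f for w, f in pts) / total
    return centroid, math.sqrt(variance)


# Example: a Gaussian line (sigma = 8 Angstrom) on a sloping continuum.
wave = [1400.0 + 0.5 * i for i in range(601)]
flux = [1.0 + 0.001 * (w - 1500.0)
        + 5.0 * math.exp(-0.5 * ((w - 1549.0) / 8.0) ** 2) for w in wave]
line = subtract_linear_continuum(wave, flux)
centroid, sigma = line_moments(wave, line, 1450.0, 1610.0)
```

Using `blue=1480.0`, `red=1690.0` and moments over 1496 to 1596 Å instead would correspond to the window choices quoted above for Wilhite et al. (2008).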

The fitting procedure illustrated in Fig. A5 has the
advantages of that used by Wilhite et al. (2008) while also
making a simple correction for emission in the red wing of
the line. The primary problems associated with this fit are
the predefined C iv line region and the highly artificial way
we have corrected for contaminating emission. Each of these
issues will systematically bias our derived line widths. This
procedure can be considered as an extreme limit to plausible
fits.

In the next section we discuss calculations of the width
of the C iv line, before making a quantitative comparison
between the three fitting procedures in Section A4.



A3 Calculating the line width 

There are several techniques for parametrising the width of
a spectral line. Three commonly used measures are the full
width at half maximum (FWHM), the inter-percentile
velocity (IPV) width, and the dispersion (or second moment)
of a line.

Figure A2. Fitting the C iv region of the 2QZ composite by assigning three Gaussian components to the C iv line, and one each for
He ii and O iii]. (a) The continuum-subtracted composite with dashed lines showing the fits to He ii and O iii]. (b) The composite after
subtracting the Gaussians fit to He ii and O iii]. Dashed lines show the three components of the C iv line and the heavy line shows their
sum.

Figure A3. Fitting the C iv region by assigning both a broad and narrow component to the He ii line. Here the observed emission
redwards of 1600 Å is removed as He ii and the result is a relatively symmetric profile for C iv. In (a) the continuum subtracted 2QZ
QSO composite is shown along with the three Gaussians fitted to the He ii (two) and O iii] (one) features. In (b) these lines have been
subtracted from the composite and two Gaussians have been fitted to the C iv line; their sum is shown as the heavy line. Note that all
Gaussians are fit simultaneously in the procedure.

Figure A5. The simplest prescription for fitting the C iv line. (a) shows the 2QZ QSO composite with a linear fit between windows at
1450 and 1610 Å. In (b) the continuum has been subtracted from the spectrum.

The FWHM is most commonly employed as the line 
width parameter when considering QSO broad lines. It is 
easy to define and calculate, and for high S/N spectra 
gives an accurate line width. However, when measuring the 
FWHM directly from low S/N spectra problems arise, both 
when defining the maximum flux density of a line and when 
dealing with multiple crossings of the half maximum value. 

These problems can be circumvented by fitting a model
line to the spectrum and measuring the FWHM of the
model rather than from the spectrum (e.g. Laor et al. 1994;
Fine et al. 2006; Shen et al. 2008); this of course assumes
that the model gives an accurate representation of the line.

In the following analysis we measure the FWHM from
our model fits to the spectra. Our models are composed
of several Gaussian components. These Gaussians are fit to
our data with the mrqmin routine (Press et al. 1992), which
returns the fitted parameters along with their covariance
matrix. The error on the FWHM is calculated incorporating
the covariance between the fitted parameters. However,
since mrqmin does not take a covariance matrix as input,
covariance in the continuum-subtracted spectrum (see e.g.
Cardiel et al. 1998) is not incorporated in the final error
estimate. Hence the error on the FWHM we calculate will be
underestimated.
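As an illustration of measuring the FWHM from a model rather than from the pixels, the sketch below finds the half-maximum point of a sum of Gaussians by bisection. It assumes the components share one centre and have positive amplitudes (as in the symmetric fits described later), so the profile falls monotonically away from the peak. The function names and the (amplitude, sigma) representation are illustrative choices, not the fitting code used in the paper.

```python
import math

C_KMS = 299792.458  # speed of light in km/s


def model_flux(offset, components):
    """Flux of a sum of Gaussians sharing one centre, at an offset from it.

    components is a list of (amplitude, sigma) pairs with positive amplitudes.
    """
    return sum(a * math.exp(-0.5 * (offset / s) ** 2) for a, s in components)


def model_fwhm(components, tol=1e-9):
    """FWHM of the summed model, found by bisection on the half-maximum level."""
    half_max = 0.5 * model_flux(0.0, components)
    lo, hi = 0.0, 10.0 * max(s for _, s in components)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model_flux(mid, components) > half_max:
            lo = mid  # still above half maximum: move outwards
        else:
            hi = mid
    return lo + hi  # twice the half-width (lo + hi) / 2


def to_kms(width, centre):
    """Convert a wavelength width into a velocity width about the line centre."""
    return C_KMS * width / centre
```

For a single Gaussian this reproduces the familiar FWHM = 2 sqrt(2 ln 2) sigma; for a narrow plus broad pair the result lies between the two individual FWHMs.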

Another measure of the line width that is becoming
increasingly widespread is the dispersion or second moment
of the line (Fromerth & Melia 2000; Vestergaard & Peterson
2006; Wilhite et al. 2007). However, we find that the excessive
weighting this measure assigns to the values of pixels
in the wings of the lines makes it an unreliable estimator of
line width in low S/N spectra.

Inter-percentile velocity (IPV) widths offer a third
parametrisation of the line widths (e.g. Whittle 1985;
Fine et al. 2008). While at first glance the process of
measuring an IPV width is similar to measuring the FWHM, the
dependence of IPV widths on the cumulative flux distribu- 
tion rather than the flux density at a given point makes the 
IPV measurements considerably more robust with respect to 
noise in the spectrum. The IPV width can be very suscep- 
tible to uncertainty in the continuum placement. However, 
even in low S/N spectra a linear continuum can be fit to 
a relatively high degree of accuracy and precision given a 
modest spectral region to fit over. 

Like the dispersion, IPV widths are somewhat affected 
by noise in the wings of lines; in particular this can affect the 
total flux of a line and how one defines the zero-point of the 
cumulative flux distribution. However, when calculating the 
dispersion the weight given to a single pixel is proportional 
to the square of the displacement of that pixel from the line 
centre. This power-of-two dependence makes the dispersion 
highly susceptible to noise in the wings of a line; this is not 
a problem for IPV widths. 
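The quadratic weighting can be seen in a small numerical experiment. The sketch below computes the flux-weighted second moment of a synthetic Gaussian line and then adds one spurious pixel of equal height either in the core or far out in the wing. The grid, line width and spike height are arbitrary illustrative values.

```python
import math


def second_moment(wave, flux):
    """Flux-weighted dispersion of a continuum-subtracted line (units of wave)."""
    total = sum(flux)
    centroid = sum(w * f for w, f in zip(wave, flux)) / total
    variance = sum(f * (w - centroid) ** 2 for w, f in zip(wave, flux)) / total
    return math.sqrt(variance)


# Synthetic Gaussian line with sigma = 10 on a unit pixel grid (illustrative).
wave = [float(x) for x in range(-100, 101)]
flux = [math.exp(-0.5 * (w / 10.0) ** 2) for w in wave]


def with_spike(offset, height=1.0):
    """Copy of the line with one spurious pixel of the given height added."""
    noisy = list(flux)
    noisy[wave.index(float(offset))] += height
    return noisy


clean = second_moment(wave, flux)
near = second_moment(wave, with_spike(10))  # spike in the line core
far = second_moment(wave, with_spike(80))   # same spike far out in the wing
```

In this toy case the core spike barely changes the dispersion, while the identical spike in the wing nearly doubles it.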

In this analysis we calculate the 50% IPV width for the
C IV lines in our sample (i.e. the width between the 25%
and 75% crossings of the cumulative flux distribution). We
calculate the IPV width directly from the spectrum,
interpolating between pixels either side of the crossings. Errors
on the IPV widths are calculated from the spectral variance
array including the contribution of covariance introduced by
the iron and continuum subtraction.
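A minimal sketch of such a measurement is given below. It assumes the input has already been continuum subtracted and restricted to the line region, integrates the flux with the trapezoidal rule, and takes the first crossing of each percentile; these are illustrative choices and the error propagation described above is not included.

```python
import math


def ipv_width(wave, flux, lower=0.25, upper=0.75):
    """Width between two percentiles of the cumulative flux of a line.

    wave and flux describe a continuum-subtracted spectrum restricted to the line.
    """
    # Cumulative flux at each pixel, integrated with the trapezoidal rule.
    cumulative = [0.0]
    for i in range(1, len(wave)):
        step = 0.5 * (flux[i] + flux[i - 1]) * (wave[i] - wave[i - 1])
        cumulative.append(cumulative[-1] + step)
    total = cumulative[-1]

    def crossing(fraction):
        """First crossing of fraction * total, linearly interpolated between pixels."""
        target = fraction * total
        for i in range(1, len(cumulative)):
            if cumulative[i - 1] < target <= cumulative[i]:
                t = (target - cumulative[i - 1]) / (cumulative[i] - cumulative[i - 1])
                return wave[i - 1] + t * (wave[i] - wave[i - 1])
        raise ValueError("cumulative flux never reaches the requested fraction")

    return crossing(upper) - crossing(lower)
```

For a well-sampled Gaussian of dispersion sigma this returns approximately 1.349 sigma, the separation of the quartiles of a normal distribution.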

Finally, for any line width measure, we subtract the 
resolution of the spectrograph in quadrature from the mea- 
sured line width under the assumption of a Gaussian profile 
for both the emission line and instrumental resolution. 
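Under the Gaussian assumption the widths add in quadrature, so the correction is a one-line calculation. The sketch below is illustrative; returning None for an unresolved line is a choice made here and is not something specified in the text.

```python
import math


def correct_for_resolution(measured_width, instrumental_width):
    """Remove instrumental broadening in quadrature (both widths in km/s).

    Returns None when the line is unresolved (measured <= instrumental).
    """
    if measured_width <= instrumental_width:
        return None
    return math.sqrt(measured_width ** 2 - instrumental_width ** 2)
```

Both widths must be expressed in the same measure (for example both as FWHM) before the subtraction.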



A4 Comparisons between fitting procedures

In this section we compare the above prescriptions for fitting
the C IV line to highlight the possible biases introduced by
each. The precise procedures implemented in each case are:

1) We fit a linear continuum under the C IV region between
two 45 Å wide spectral windows at 1430 < λ < 1475 Å and
1680 < λ < 1725 Å. The continuum is subtracted from the
spectrum and we then fit five Gaussians to the residual.
Two of these have their wavelengths fixed at the expected
wavelengths of He II and O III]; the final three are taken to
describe the C IV line.

Both the FWHM and 50% IPV widths are measured for
the line. The FWHM is measured from the three-Gaussian
model for the line while the IPV width is calculated directly
from the spectrum.

2) We perform the same continuum fit as in (1). Five Gaussians
are fitted to the continuum-subtracted spectrum: two
are fixed to He II and one to O III], while the final two
Gaussians describe the C IV line. In the fit, the two Gaussians
which describe He II share a common central wavelength, as do
the two which describe C IV. The three
Gaussians which were fitted to He II and O III] are then
subtracted from the spectrum; the FWHM is calculated from
the double-Gaussian model for the line.

3) As a final prescription we fit a linear continuum between
20 Å wide windows centred at 1450 and 1610 Å. The continuum
is subtracted and two Gaussians are fit to the residual
C IV line with their central wavelengths tied together.
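The continuum step common to all three prescriptions can be sketched as a straight-line fit to the pixels inside two windows, followed by subtraction. The example below uses an unweighted least-squares fit for brevity, which is an assumption; a fit weighted by the variance array would follow the same pattern. The window limits are those quoted for prescription (3).

```python
import math


def fit_linear_continuum(wave, flux, windows):
    """Least-squares straight line through the pixels inside the given windows.

    windows is a list of (low, high) wavelength limits. Returns (slope, intercept).
    """
    pts = [(w, f) for w, f in zip(wave, flux)
           if any(lo <= w <= hi for lo, hi in windows)]
    n = len(pts)
    mean_w = sum(w for w, _ in pts) / n
    mean_f = sum(f for _, f in pts) / n
    sxx = sum((w - mean_w) ** 2 for w, _ in pts)
    sxy = sum((w - mean_w) * (f - mean_f) for w, f in pts)
    slope = sxy / sxx
    return slope, mean_f - slope * mean_w


def subtract_continuum(wave, flux, windows):
    """Return the spectrum with the fitted linear continuum removed."""
    slope, intercept = fit_linear_continuum(wave, flux, windows)
    return [f - (slope * w + intercept) for w, f in zip(wave, flux)]


# Prescription (3): 20 A wide windows centred at 1450 and 1610 A.
WINDOWS_3 = [(1440.0, 1460.0), (1600.0, 1620.0)]
```

Any line emission that falls inside the windows is absorbed into the continuum, which is how the choice of windows feeds through to the measured line flux and width.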

We compare these prescriptions in two ways. Firstly, we
compare fits to the high S/N 2QZ QSO composite spectrum.
Secondly, we apply each of these routines to our dataset and
compare the overall results.



A4.1 Detailed fits to the 2QZ composite

A quick way to compare the differing fitting procedures is
to compare the results obtained when fitting the 2QZ QSO
composite. The fits to the composite are shown in Figs. A2,
A3 and A5. Fig. A6 shows how the final C IV line profiles
differ for each of the three approaches and Table A1 gives
a selection of line parameters calculated from the differing
fits.

The C IV profiles for prescriptions (1) and (2) are identical
except for the red wing of the line. In addition all profiles
are similar in the core of the line, except that fit (3) is
somewhat lower since the continuum is fit higher.

The measured line parameters bear out these differences.
The IPV width and, in particular, the 2nd moment of
the line are sensitive to flux in the wings of the lines. Hence
the values for these parameters depend strongly on the
fitting procedure used. The FWHM does not have a strong
dependence on the line wings and is similar for all fits. It is
slightly smaller in fit (3) since we have subtracted off more
continuum.

Figure A6. Comparison of the residual C IV line when fitted using
the three prescriptions described in the text. (1) assumes the
C IV line is composed of three Gaussians and fits single Gaussians
to He II and O III]. (2) fits both broad and narrow components to
He II and (3) takes a simple linear continuum fit between predefined
spectral windows either side of the C IV line.

Table A1. A selection of parameters calculated for the C IV line
in the 2QZ composite spectrum based on the differing fitting
procedures. The flux values have been normalised to the value given
by fit (3).

Fit   Flux          FWHM     IPV      2nd moment
      (rel. to 3)   (km/s)   (km/s)   (km/s)
(1)   1.53          4600     6610     8410
(2)   1.32          4600     4540     4800
(3)   1.0           4180     3390     3120



A4.2 Overall comparisons when fitting the whole dataset

In addition to fitting the high S/N composite we apply the
three fitting procedures to our entire dataset to highlight
the relative biases of each. Fig. A7 compares the measured
FWHM and IPV width of the C IV line when measured via
each prescription.

As when fitting the 2QZ composite, each prescription
gives equivalent results for the FWHM of the C IV line.
Fit (3) does give slightly smaller FWHMs by a factor of
~1.07 when compared with the other two procedures; this
is consistent with the 1.10 ratio obtained when fitting the
composite (Table A1). In addition we find more outliers when
comparing the results from (1) with (2) or (3), suggesting it
is a less stable technique.

Prescription (1) gives significantly higher values for the
IPV width when compared with the others. The IPV widths
as measured via (2) and (3) follow a linear relationship.
Their means are offset by a factor of 1.4, comparable to
the ratio of 1.5 between the IPV widths measured from the
2QZ composite. A best fit (found by minimising the 2D χ²)
shows a slight departure from a linear relation with a
gradient of 0.964 ± 0.002.



Fig. A7 suggests that FWHMs offer a robust measure
of line width that is relatively independent of the fitting
technique applied. IPV widths are strongly influenced if one
takes the emission on the red wing of C IV to be a part of the
line itself; however, they are relatively robust with respect
to the fitting procedures if not. This leaves a question as
to how well FWHM and IPV width measurements correlate
with each other. Fig. A8 compares these measurements for
each of the fitting procedures.



The dashed lines in Fig. A8 show the ratio
FWHM/IPV = 1.75 that is applicable for a Gaussian line.
This line represents a hard limit, and a two-Gaussian model 
cannot have FWHM/IPV > 1.75. The fact that we do see 
scatter over the line is due to the IPV width being measured 
directly from the spectrum, while the FWHM is measured 
from the model fit. 
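The value of 1.75 follows from FWHM = 2 sqrt(2 ln 2) sigma and a 50% IPV width equal to the inter-quartile range of a normal distribution. The sketch below evaluates that ratio and, as a numerical check, computes FWHM/IPV for Gaussians sharing a common centre. The helper names and the example components are illustrative, and a single example does not prove the limit in general.

```python
import math
from statistics import NormalDist


def half_width(func, target, hi, tol=1e-10):
    """Bisection for x in [0, hi] where the increasing function func(x) equals target."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fwhm_over_ipv(components):
    """FWHM / 50% IPV for Gaussians with a common centre; components = [(amp, sigma), ...]."""
    peak = sum(a for a, _ in components)
    flux_total = sum(a * s for a, s in components)  # proportional to integrated flux
    reach = 10.0 * max(s for _, s in components)

    def drop(x):
        """Fraction of the peak flux density lost at offset x."""
        return 1.0 - sum(a * math.exp(-0.5 * (x / s) ** 2) for a, s in components) / peak

    def enclosed(x):
        """Fraction of the total flux enclosed within +/- x of the centre."""
        return sum(a * s * math.erf(x / (s * math.sqrt(2.0))) for a, s in components) / flux_total

    return half_width(drop, 0.5, reach) / half_width(enclosed, 0.5, reach)


# Analytic ratio for a single Gaussian.
analytic = 2.0 * math.sqrt(2.0 * math.log(2.0)) / (2.0 * NormalDist().inv_cdf(0.75))
```

Adding a broad component to a narrow core widens the IPV much more than the FWHM, so such a model falls below the single-Gaussian ratio.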



Fig. A8(a) shows that the FWHM and IPV widths do
not correlate when using fitting procedure (1). This
prescription leaves the C IV line with a strong wing which affects
the IPV width more than the FWHM, producing a strong
skew towards larger IPV widths at any FWHM.

Fig. A8(b) and (c) differ from (a). In these plots the
IPV widths and FWHMs correlate well, with 75-80% of
the points lying between the lines at FWHM/IPV = 1 and
1.75. However, both (b) and (c) show a significant number
of points with low FWHMs in comparison with their IPV
widths.

Visual inspection of the spectra of objects with small
FWHM/IPV ratios reveals that the outlying points
represent a mix of objects. There are a small number of BAL
objects in this area which have been missed by our automated
BAL rejection process (see Appendix B). In addition, there
are a number of low S/N spectra in which the double Gaussian
fit results in a narrow Gaussian being fitted to a noise spike
in the spectrum, which severely narrows the FWHM. Lastly,
there are a small number of objects which genuinely show
very peaky profiles with a broad underlying emission and so
have a small FWHM/IPV ratio.

In light of the number of low S/N objects which have
FWHMs affected by fits to noise spikes, it seems we are
allowing too many degrees of freedom in our fitting to low
S/N objects. In fitting procedures (2) and (3), where the
C IV line is symmetric, we also perform a single Gaussian
fit to each line. We calculate the reduced χ² for each of
these fits, then take as the best model FWHM that of the
single Gaussian unless the double Gaussian fit improves the
reduced χ² by more than one. Fig. A9 compares the IPV
width with this best model FWHM. Here we can see that
the IPV width and best model FWHM correlate strongly.
There are a small number of outliers with small FWHM/IPV
ratios, most of which represent truly peaky C IV line profiles.
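The selection rule for the best model FWHM can be written as a short function. The dictionary layout below is an illustrative assumption; only the criterion itself (prefer the double Gaussian when it lowers the reduced χ² by more than one) is taken from the text.

```python
def best_model_fwhm(single, double, threshold=1.0):
    """Pick the FWHM from the single or the double Gaussian fit.

    single and double are dicts with keys 'chi2', 'dof' and 'fwhm'.
    The double Gaussian is preferred only if it lowers the reduced
    chi-squared by more than `threshold`.
    """
    red_single = single["chi2"] / single["dof"]
    red_double = double["chi2"] / double["dof"]
    if red_single - red_double > threshold:
        return double["fwhm"]
    return single["fwhm"]
```

In a noisy spectrum both fits tend to have a reduced χ² near unity, so the single Gaussian is retained and a narrow component fitted to a noise spike cannot set the FWHM.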
Figure A7. Comparisons of the measured line width of the C IV line in our whole dataset when applying the three fitting prescriptions
described in the text.

Figure A8. Comparisons between IPV width and FWHM for each of the objects in our sample when applying the three fitting procedures
discussed in the text. The solid line gives the 1:1 relation while the dashed line shows the ratio FWHM/IPV = 1.75 appropriate for a
Gaussian.



A5 Summary of line fitting techniques 

Of the three line fitting techniques we have outlined, we
prefer prescription (2). Prescription (1) results in a large
red wing on the C IV line and we find no evidence that this
emission is genuinely associated with C IV. Indeed there are
examples of spectra which show a dip in emission between
the C IV line and a bulge of emission redwards of 1600 Å.
Fig. A10 shows the spectrum of QSO J024634.09-082536.1
taken as part of the SDSS. The dip in emission to
approximately the continuum level at 1600 Å makes it difficult to
associate the emission redwards of 1600 Å with the C IV line.
It appears that the FWHM as measured via prescription (1)
gives results similar to those from the other fitting
procedures. However, the IPV widths are strongly affected by the
red wing of the line.

Figure A9. Comparisons between IPV widths and best model FWHMs for fitting procedures (2) and (3). The solid line gives the 1:1
relation while the dashed line shows the ratio FWHM/IPV = 1.75 appropriate for a Gaussian.

Figure A10. C IV region of the SDSS spectrum of object
J024634.09-082536.1. A continuum has been fitted between ~1500
and 1700 Å. The emission redwards of ~1600 Å can hardly be
considered to be associated with the C IV line.

Prescription (3) has the advantage of simplicity. However,
it is overly simplistic to the extent that it can produce a
systematic bias in our results. Furthermore, fitting a linear
continuum between predetermined points either side of the
C IV line limits any measurements made on the line to be
within these limits, and very broad lines could be affected.
While there are very few lines as broad as this, we believe
prescription (2) serves better to describe both the C IV emission
and the emission surrounding the line.

In the analysis that follows we use fitting procedure
(2).



A6 Testing the line fitting routine 

Fitting procedure (2) is relatively complex and we need to
be confident in our results, in particular for low S/N spec- 
tra. To test the effect of S/N on the accuracy of our fitting 
routine we take the highest S/N spectra from our SDSS and 
2dF samples. We add random Gaussian noise to these spec- 
tra and then re-measure the C iv line width in the degraded 
spectrum. For each original spectrum we add six different 
levels of noise, and repeat the measurement 100 times using 
a different random seed. 

In Fig. A11(a) and (c) we compare the average line
width measured in the 100 degraded spectra with the line
width measured from the original spectrum for the IPV and
best model FWHM respectively. In (b) and (d) we compare
the rms in the 100 line width measurements with the average
error on the measurements. Fig. A11 only shows results for
high S/N SDSS spectra; the 2dF spectra give almost
identical results.
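The structure of this test can be sketched with a synthetic line in place of a real high S/N spectrum. The example below adds Gaussian noise of a chosen level, re-measures a 50% IPV width for each realisation, and returns the mean and rms of the measurements. The width routine, the noise levels and the use of a single seeded random stream (rather than one seed per realisation) are illustrative assumptions.

```python
import math
import random


def ipv_width(wave, flux, lower=0.25, upper=0.75):
    """50% IPV width from the trapezoidal cumulative flux (first crossings)."""
    cumulative = [0.0]
    for i in range(1, len(wave)):
        cumulative.append(cumulative[-1]
                          + 0.5 * (flux[i] + flux[i - 1]) * (wave[i] - wave[i - 1]))

    def crossing(fraction):
        target = fraction * cumulative[-1]
        i = next(k for k in range(1, len(cumulative)) if cumulative[k] >= target)
        t = (target - cumulative[i - 1]) / (cumulative[i] - cumulative[i - 1])
        return wave[i - 1] + t * (wave[i] - wave[i - 1])

    return crossing(upper) - crossing(lower)


def degrade_and_measure(wave, flux, noise_sigma, n_trials=100, seed=0):
    """Mean and rms of the width measured from n_trials noisy copies of a spectrum."""
    rng = random.Random(seed)
    widths = []
    for _ in range(n_trials):
        noisy = [f + rng.gauss(0.0, noise_sigma) for f in flux]
        widths.append(ipv_width(wave, noisy))
    mean = sum(widths) / n_trials
    rms = math.sqrt(sum((w - mean) ** 2 for w in widths) / n_trials)
    return mean, rms
```

Comparing the mean with the noiseless width gives the bias at that noise level, and comparing the rms with the quoted errors tests whether those errors are realistic.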

Down to a S/N ~ 3 Å⁻¹ both measures are relatively
stable with respect to S/N. In general we find that (for
S/N > 3 Å⁻¹) the IPV width provides a less biased line
width measurement, and more accurate errors (<5% offset
in each at S/N = 3 Å⁻¹) when compared to the best model
FWHM. In the analysis in this paper we will take the IPV
width as the primary line width measure, and apply a S/N
cut of S/N > 3 Å⁻¹ that rejects ~10% of our sample.



APPENDIX B: BAL REJECTION 

Broad absorption features alter the appearance of emission
lines and make accurate measurements of the profile
impossible. The C IV line is more commonly affected by BAL
systems than lower ionisation lines (e.g. Mg II; Trump et al.
2006), and when fitting C IV one must carefully detect and
reject these QSOs from the analysis.



B1 Balnicity and Absorption indexes

The traditional method for measuring the strength of broad
absorption in quasar spectra is with the balnicity index (BI;
Weymann et al. 1991). More recently Trump et al. (2006)
made a catalogue of BAL objects in the SDSS DR3 using the
slightly different absorption index (AI). Both of these
indicators identify broad absorption troughs by comparing with
fitted template spectra. Regions in a spectrum in which
consecutive pixels fall below 90% of the fitted template, over a
continuous region exceeding some range in velocity, are
considered broad absorption lines. The range varies: 2000 and
1000 km/s for the BI and AI respectively, as do the regions
in which the search for these absorption troughs is carried
out: 3000-25000 km/s for the BI and 0-29000 km/s for the
AI (for both indexes the BAL search is only carried out
bluewards of C IV). Once the broad absorption troughs around a
line have been identified, the indexes themselves are
essentially the cumulative equivalent width of these troughs.

Figure A11. (a) and (c) compare the average line width measured from 100 noisy spectra with the width measured from the original
high S/N SDSS spectrum. (b) and (d) compare the average error on these measurements with the rms of the line widths. (a) and (b)
compare IPV width measurements and (c) and (d) compare the best model FWHM. In each plot we give the results as a function of
S/N.

From their definition it is clear that the BI is a more
strict definition of whether an object has broad absorption.
The AI detects narrower absorbers and searches for them
over a wider velocity range. The differences between these
two indexes are discussed in Trump et al. (2006), who found
that, for the C IV line in SDSS DR3 quasars, 10% had a
non-zero BI compared with 26% with an AI.

We find that both the BI and AI are too strict when
identifying BAL objects and we have developed our own
system for rejecting absorption systems. Our process does
not equate to a new method for identifying 'definite' BALs
(see e.g. Knigge et al. 2008), but is an automated method
for identifying spectral lines with absorption features which
could affect the profile of the line. Indeed many obviously
narrow-line absorbers are also rejected with our technique.



B2 BAL identification 

We use two procedures for identifying BALs in our data. 
The first is similar to the BI or AI, the second uses pixel 
binning to search for troughs in the spectra. 



B2.1 Method 1: Consecutive pixels 

We do not fit quasar templates to our spectra as in the
BI or AI processes. However, we do fit Gaussian models to
the lines as part of the fitting procedure. We use the best
Gaussian model in a manner similar to the templates in
the BI/AI methods. We search for pixels which lie more
than 1σ below the model within ±2 FWHM of the fit to the
line. Any spectrum which has consecutive pixels below this
value spanning > 750 km/s is discarded as a potential BAL
system.
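The search described above can be sketched as a single pass over the pixels. The argument names are illustrative, and the span of a run is measured here between the centres of its first and last low pixels, which is an assumption about a detail the text does not specify.

```python
def has_broad_trough(velocity, flux, model, sigma, fwhm,
                     n_sigma=1.0, min_span=750.0):
    """True if consecutive low pixels span more than min_span km/s.

    velocity : pixel velocities relative to the fitted line centre (km/s), ascending
    flux, model, sigma : observed flux, Gaussian model and 1-sigma error per pixel
    fwhm : FWHM of the model (km/s); only pixels within +/- 2 FWHM are searched
    """
    run_start = None
    for v, f, m, s in zip(velocity, flux, model, sigma):
        low = abs(v) <= 2.0 * fwhm and f < m - n_sigma * s
        if low:
            if run_start is None:
                run_start = v  # first pixel of a new run of low pixels
            if v - run_start > min_span:
                return True
        else:
            run_start = None  # run broken by a pixel that is not low
    return False
```

A single pixel scattered back to within 1σ of the model resets the run, which is the weakness of this criterion for shallow troughs in noisy spectra.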

The two main differences between our method and the
BI/AI methods are the use of 1σ as the limit to define a
'low' pixel, and our use of 750 km/s as the width threshold
for defining broad absorption. We loosen our definition of
broad absorption to 750 km/s simply to be sure of rejecting
any spectra which are significantly affected by absorption.
A deep absorption trough 750 km/s wide can still seriously
affect line profile measurements and, while these may not
represent true BAL systems, they are contaminants to our
data.

We use 1σ rather than a fixed percentage of the model
flux because it has a simpler statistical interpretation. A
velocity of 750 km/s represents ~10 pixels in SDSS spectra
and 3-4 pixels in a 2dF spectrum. In a 2dF spectrum
this criterion represents a > 99% confidence level for
detecting consecutive pixels which are genuinely deviant from the
model fit.
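One way to arrive at a figure of this size is to ask how often a given run of pixels would all fall more than 1σ below the model purely through noise. The sketch below assumes independent Gaussian noise in each pixel and considers one fixed set of consecutive pixels; it is an illustrative calculation and may not be the exact one behind the quoted confidence level.

```python
from statistics import NormalDist

# Chance that one pixel lies more than 1 sigma below the model under Gaussian noise.
p_single = NormalDist().cdf(-1.0)


def chance_of_run(n_pixels, p=p_single):
    """Probability that n given consecutive, independent pixels are all low by chance."""
    return p ** n_pixels


confidence_2df = 1.0 - chance_of_run(3)    # 3 pixels, the shorter 2dF case
confidence_sdss = 1.0 - chance_of_run(10)  # ~10 pixels in SDSS spectra
```

With about 16 per cent of pixels expected below 1σ, three in a row occur by chance in well under one per cent of cases, and the roughly ten pixels needed in SDSS spectra make a chance run far less likely still.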

This BAL identification routine is effective at finding
obvious BAL objects, and BAL systems in high S/N spectra.
However, our dataset contains many objects which
exhibit lower level broad absorption, often at low S/N, which
are missed by the routine. We miss BALs for two reasons.
Firstly, our model fit to the emission line is affected by the
absorption trough, making it less likely to find consecutive
pixels below the model fit. Secondly, the necessity for a large
number of consecutive pixels to lie below the limit is a very
strict constraint. In objects which show only low level
absorption, it is likely that one or more of the several pixels
affected by the absorption will be scattered to within 1σ of
the model fit by random noise.

One can relax the criterion for identifying pixels affected 
by absorption. However, if we loosen the criterion too far 
we begin to reject high S/N objects which have systematic 
residuals when compared with the model fit to the emission 
line. 

We require a second method for identifying these low 
level, low S/N BAL systems. 

B2.2 Method 2: Binning the spectrum 

When inspecting a large number of spectra with low level 
broad absorption it became clear that the eye is capable of 
finding BALs when an automated routine struggles for two 
reasons: firstly the eye can automatically smooth a spectrum 
which reduces noise, and secondly it is sensitive to the steep 
sides of an absorption trough. 

We have constructed a second BAL identification routine
that relies on rebinning the spectrum onto a larger pixel
scale (i.e. constructing spectra with a larger dispersion in
terms of Å/pix). The rebinned spectrum has a higher S/N
per pixel than the original and, by comparing a pixel with
those on either side of it in the rebinned spectrum, we can
identify whether it has been affected by absorption.

We rebin the spectrum onto a number of pixel scales
between 650 and 1000 km/s (~10-15 pixels in SDSS spectra,
~3-5 pixels in 2dF spectra). For each pixel in the rebinned
spectrum we interpolate between the two pixels either side.
If the central pixel falls below the interpolated value by more
than a given amount we reject the object as a BAL quasar.
After some experimentation we find that 5.5σ makes for a
good cutoff in SDSS spectra. The lower S/N and dispersion
of 2dF spectra means that the cutoff level must be reduced
to 3.5σ to be sure of detecting BALs.
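A minimal sketch of this rebinning test follows. It assumes that σ refers to the error on the central rebinned pixel, that bins are formed from whole numbers of adjacent pixels starting at the first pixel, and that the default bin sizes correspond to the SDSS case; all names are illustrative.

```python
import math


def rebin(flux, variance, factor):
    """Average `factor` adjacent pixels; returns (binned flux, binned 1-sigma error)."""
    n_bins = len(flux) // factor
    binned, errors = [], []
    for b in range(n_bins):
        chunk = slice(b * factor, (b + 1) * factor)
        binned.append(sum(flux[chunk]) / factor)
        errors.append(math.sqrt(sum(variance[chunk])) / factor)
    return binned, errors


def binned_trough(flux, variance, factor, cutoff):
    """True if any rebinned pixel lies more than cutoff * sigma below its neighbours."""
    binned, errors = rebin(flux, variance, factor)
    for i in range(1, len(binned) - 1):
        # With equal bin widths the interpolated value is the mean of the neighbours.
        interpolated = 0.5 * (binned[i - 1] + binned[i + 1])
        if interpolated - binned[i] > cutoff * errors[i]:
            return True
    return False


def is_bal_candidate(flux, variance, factors=range(10, 16), cutoff=5.5):
    """Apply the test on several pixel scales (defaults assume SDSS sampling)."""
    return any(binned_trough(flux, variance, k, cutoff) for k in factors)
```

For 2dF spectra the same functions would be called with `factors=range(3, 6)` and `cutoff=3.5`, following the values quoted above.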

We illustrate our second BAL identification scheme in
Fig. B1. Fig. B1(a) shows the C IV region of the SDSS
spectrum of object J011229.41+151213.9. In Fig. B1(b) the
absorption trough at 1530 Å is expanded. The points in the
plot are the original SDSS spectrum. The crosses show three
points from the rebinned spectrum. The width of each cross
indicates the bin size and the height shows the error on the
rebinned flux density. The dashed line is interpolated
between the pixels on either side of the central pixel, which
deviates by more than 5.5σ from the line.

The BAL identification scheme not only works for
objects such as J011229.41+151213.9, in which the absorption
trough dips below the unabsorbed spectrum on either side
of the trough, but also in BAL spectra where the
absorption only drops below the unabsorbed spectrum on one side.
Fig. B2(a) shows the spectrum of J010810.52+001755.8.
The blue wing of the C IV line is heavily absorbed. However,
the absorption is such that the flux density does not dip
significantly below the continuum level on the blue side of the
absorption trough. Our method of rebinning the spectra to
search for BALs will also detect the absorption in this
spectrum by detecting the sharp increase in flux density around
1540 Å.

In fact, since most BALs are more than 1000 km/s wide,
this method of identifying BALs most commonly works by
finding their steep edges. We do not rebin the spectra onto
scales greater than 1000 km/s because as we approach the
width of the emission lines we find that many non-BAL
objects are also rejected. To stop false identifications of
non-BAL objects the gradient of the rebinned spectrum must be
slowly varying across the emission line. As the bin width
approaches the width of the emission line the rebinned profile
is no longer smooth, and we begin to identify non-BAL objects.

High S/N spectra, in particular those with narrower 
emission lines, can be wrongly identified as BAL objects 
with this routine. While the number of objects falsely iden- 
tified as having BALs is small, the resulting bias will be to- 
wards broader line, lower S/N objects. However, these spec- 
tra can be identified and returned to the sample. 

During the fitting procedure we fit both single and double
Gaussian profiles to the emission line. Quasar emission
lines are not well modelled by single Gaussians and, in the
case of high S/N spectra, they have large reduced χ² values. The
double Gaussian models, however, fit the emission lines
relatively well and have reduced χ² values around unity. We find that
the vast majority of spectra which are wrongly classified as
having BALs by our routine have a significantly improved
fit to the double Gaussian model for the emission line
compared to the single Gaussian fit (i.e. the difference between
the reduced χ² values is greater than one).

Our procedure for rejecting BALs from our sample is
then as follows. We perform both of our BAL detection
routines (both searching for consecutive low pixels and the
binning technique). For the objects which are rejected as
having BALs we compare the reduced χ² values for single and double
Gaussian fits to the emission line. All those which have a
significant improvement in the fit when using two Gaussians
to model the lines are then visually inspected and any
non-BAL objects are returned to the sample.



B3 Testing the BAL rejection code 

To gain an impression of how well our routine works at 
removing BAL objects from our sample, as well as how 
many non-BAL objects may be rejected, we have visually 
inspected 1,000 spectra from the SDSS and identified them 
as BAL or non-BAL objects. 

Of the 1000 objects, 356 were visually identified as
having BALs; of these the automated routine rejects 326. In
total 358 objects were rejected by the BAL identification
routine, with 32 non-BAL objects rejected because they have
high S/N narrow lines. In 27 of these false-positive
identifications the C IV line is significantly better fit with a double
Gaussian model compared to a single Gaussian and so would
be returned to our sample after visual inspection.

Figure B1. (a) The C IV region of the spectrum of SDSS object J011229.41+151213.9. In (b) we expand the region of the spectrum
around 1530 Å. The points show the original spectrum. The crosses show three points in the rebinned spectrum. The width of each cross
indicates the width of the bin and the height shows the error on the rebinned flux. The dashed line is interpolated between the pixels on
either side of the central pixel.

Figure B2. SDSS spectrum of quasar J010810.52+001755.8 together with an expanded plot of the BAL region and the local rebinned
spectrum. Symbols in plots are as in Fig. B1.

After performing our BAL rejection procedure on these 
1,000 objects, 669 are passed by the procedure as not having 
BALs. Included in these 669 objects are 30 unidentified BAL 
systems (4%), and we have rejected 5 non-BAL objects in 
a biased manner (< 1 %). 
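The bookkeeping behind these figures can be checked with a few lines of arithmetic. The variable names are illustrative, and the sketch assumes that all 27 recoverable false positives are in fact returned to the sample after visual inspection.

```python
n_total = 1000
n_bal_visual = 356     # visually classified as having BALs
n_bal_rejected = 326   # of those, rejected by the automated routine
n_rejected = 358       # all automated rejections
n_recovered = 27       # false positives better fit by a double Gaussian

n_false_positive = n_rejected - n_bal_rejected   # non-BALs rejected: 32
n_missed = n_bal_visual - n_bal_rejected         # BALs not rejected: 30
n_passed = n_total - n_rejected + n_recovered    # final accepted sample: 669
n_lost = n_false_positive - n_recovered          # non-BALs lost for good: 5

completeness = n_bal_rejected / n_bal_visual     # fraction of BALs removed
contamination = n_missed / n_passed              # BAL fraction in accepted sample
```

On these numbers the routine removes a little over 90 per cent of the visually identified BALs, and the residual BAL fraction in the accepted sample is between 4 and 5 per cent.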

It is worth pointing out that our BAL identification pro- 
cess also rejects a significant number of narrow absorption 
objects. In some of these spectra the line profile is relatively 
unaffected by the absorption, and line properties could be 
derived from the spectrum. However, since these are rejected 
in an unbiased manner in terms of the emission line proper- 
ties they are counted as acceptable BAL rejections. 



